Chinh (lelouvincx) / Weekly Report - 2026-W14

Created Mon, 25 May 2026 00:00:00 +0000 Modified Mon, 25 May 2026 06:02:25 +0000

Weekly Report — Mar 30 – Apr 5, 2026 (friday mode)

Work

Add2Cart

  • DONE Eliminate bridge table (bridge_product_retailer) from Redshift pipeline and Holistics AML layer
    • Completed all six phases (0–5) of the migration plan (Slack thread):
      • Phase 0: Snapshot baseline tables (_migration_baseline, _bridge_backup) created on Redshift.
      • Phase 1: Added retailer_id + retailer_current_pricing_skey columns to atc_price_history, backfilled ~27M rows (75.7% coverage). Known gap: 8,713 keys from Chemist Discount Center slug mismatch — documented.
      • Phase 2: Deployed 5 updated SPs (sp_refresh_daily_prices, sp_refresh_atc_price_history_enrich, sp_run_all, sp_refresh_price_history_from_s3, sp_validate). Ran sp_run_all() — enriched table validated at 12.1M rows, join path works without bridge.
      • Phase 3: 7 commits on feat/eliminate-bridge-aml — removed bridge from all 4 datasets + market_price_rank model, added direct rcp_skey relationships, deleted agg_retailer_product_price model. Pushed to Holistics, dashboards verified.
      • Phase 4 (soak): Monitored dashboards end-to-end through bridgeless path — no issues.
      • Phase 5 (cleanup): Dropped bridge/agg tables + SPs, deleted bridge AML model, cleaned up migration artifacts. PR #2 merged. Shared completion summary with Anurag in Slack. Bridge kept as _bridge_product_retailer_backup (frozen, 222K rows) for reference. Remaining data quality items (near-miss slugs, missing retailers, history-only) handed off to Anurag.
    • Impact: Simplified join path from master_product → retailer_current_pricing → bridge → atc_price_history to master_product → retailer_current_pricing → atc_price_history (one fewer hop), improving query performance.
  • WAITING Guidance for countries-level data ingestion (Notion D4 doc)
    • Investigated multi-country retailer ingestion from 6 RDS databases (airbyte_schema_*). Found critical price_history_skey collision: 2,774 keys collide across countries (39,710 rows), mainly Watsons across SG/ID/MY.
    • Decision: Holistics will not implement the ingestion. Created D4 guidance document with two approaches evaluated: (1) Dynamic Schema (separate Redshift schemas per country, switch via User Attributes) — recommended for Add2Cart; (2) Consolidated Schema (UNION ALL + RLP). Added Mermaid diagrams, comparison table, collision detection SQL, SP impact matrix, and AML examples.
    • Feedback: initial version was too detailed on implementation. Rewrote to lead with “what needs to be done” and high-level approaches first. Waiting for anh Huy to review.
    • Aligned with Anurag on Slack: he ingested retailers for all countries into Redshift.
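
The price_history_skey collision described above can be illustrated with a small sketch. It assumes each row has been reduced to a (country, key) pair; the function names and the sample rows are hypothetical and do not come from the D4 document or its collision detection SQL.

```python
from collections import defaultdict


def find_cross_country_collisions(rows):
    """Map each key seen in more than one country to its sorted country list.

    `rows` is a list of (country, price_history_skey) pairs.
    """
    countries_by_key = defaultdict(set)
    for country, skey in rows:
        countries_by_key[skey].add(country)
    return {
        skey: sorted(countries)
        for skey, countries in countries_by_key.items()
        if len(countries) > 1
    }


def count_affected_rows(rows, collisions):
    """Count the rows whose key is one of the colliding keys."""
    return sum(1 for _, skey in rows if skey in collisions)


# Hypothetical sample rows, not real Add2Cart data.
sample_rows = [
    ("SG", 1001), ("ID", 1001), ("MY", 1001),  # same key in three countries
    ("SG", 1002),                              # unique key
    ("ID", 1003), ("ID", 1003),                # repeated within one country only
]

collisions = find_cross_country_collisions(sample_rows)
affected = count_affected_rows(sample_rows, collisions)
print(collisions)  # {1001: ['ID', 'MY', 'SG']}
print(affected)    # 3
```

A key repeated inside a single country is not counted, because only keys shared across countries would clash once the data is consolidated.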

Presales

  • DONE Study Jonas Chorum onboarding call 1 (Notion prep note)
    • Hotel PMS company (~1,000 properties), non-multi-tenant (each customer = separate DB on self-hosted SQL Server). Key use cases: natural language querying over reservations, embedded analytics as paid add-on, dynamic data source for per-hotel routing.
    • Answered the tunnel/bastion server question in Slack: they ran the Windows script instead of the Linux one, and the connection error was escalated to the support team (Tien).
    • anh Huy led the call and I sat in to assist. Decision from the team review: cross-database aggregation is a known limitation and needs further clarification with the customer.
  • DONE Refactor Erin (Showbie) dashboard: Chukwudi had a fever and asked via Slack for help refactoring it from query models to decoupled AQL metrics. Huddled with anh Dong, then handed over to him for the call.
  • DONE Onboarding call 2 with Basata (Taha) (ReadAI transcript, Slack debrief)
    • Good sentiment with Taha. Key feedback: creating single dashboard + embed portal flow isn’t seamless, needs to be easier to understand. Question: “any best practices to make the embedded dataset easier for business users to use?” — introduced Custom View feature. Embed Portal vs Single Dashboard distinction remains a point of confusion.
    • Embed portal tutorial already delivered: Notion guide, shared in #presales-sa Slack. Taha’s backend colleague to begin portal embedding.
  • TODO Onboarding call 1 with Jonas Chorum — vibe-coding a dynamic data source demo. Built POC showing data_source_name in AML routing per-hotel queries via JWT user attributes. Limitation identified: cross-database aggregation (50-60 DBs) requires a data mart. Moved to Apr 6.
  • TODO Polish embedding documentation — formalized as P2 backlog item. Will ask anh Tai and anh Huy next week to clarify scope and whether to own this. Friction observed across Showbie, Superbexperience, Basata, and Jonas Chorum.
  • NOTE Presales capacity: anh Huy is deliberately stepping back from leading calls, delegating to team (me, anh Dong, Chukwudi, Mario) to build team capacity. The expectation is thorough preparation before every call.
  • NOTE Presales load continues to increase: Jonas Chorum, Showbie, and Basata were active this week. The embedding use case is becoming the dominant pattern: 4 of the last 5 customers need embedded analytics.
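
The per-hotel routing idea from the Jonas Chorum POC can be sketched as a signed token that carries the data source name as a user attribute. This is a minimal standard-library HS256 example. The payload layout, the HOTEL_DATA_SOURCES mapping, and the function names are assumptions for illustration, not the actual Holistics embed payload or the POC code.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Build a compact JWT: header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_payload(token: str, secret: bytes) -> dict:
    """Verify the signature, then return the decoded payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = _b64url(
        hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    )
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    body = signing_input.split(".")[1]
    padded = body + "=" * (-len(body) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical mapping: one data source per hotel database.
HOTEL_DATA_SOURCES = {"hotel_a": "ds_hotel_a", "hotel_b": "ds_hotel_b"}


def build_embed_token(hotel_id: str, secret: bytes) -> str:
    """Put the hotel's data source name into the token as a user attribute."""
    payload = {"user_attributes": {"data_source_name": HOTEL_DATA_SOURCES[hotel_id]}}
    return sign_hs256(payload, secret)


token = build_embed_token("hotel_a", b"demo-secret")
print(decode_payload(token, b"demo-secret"))
```

Each token routes to exactly one data source, which is why the cross-database aggregation case (50-60 DBs) does not fit this pattern and needs a data mart.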

Internal

  • DONE Thuan’s PR #845 (DE-208 / mart_product__dataset_datamodel_dimensions) — MERGED. Fix for PARSE_JSON syntax error. All CI checks passed after review rounds in W13.
  • DONE Review DE-206 for anh Hieu — exchange rate document review completed. Notion doc and Slack thread.
  • DONE Visualize schedule of all data lineage (Notion: Data Pipelines Schedules)
    • Created full Mermaid flowchart of the entire pipeline: Extract & Load (21:30–23:55 VNT) → Domain Transforms (00:20–00:55) → Mart Transforms (00:15–01:35) → Post-Transform (01:35–08:00). Mapped all cron schedules from prefect.yaml, dbt model dependencies via ref() calls, and Airbyte sync frequencies.
    • Key finding: mart_event depends on mart_product (specifically mart_product__users and mart_product__dashboard_widgets), but it runs weekly on Monday at 00:15, before the daily product mart at 01:10. This is safe because the mart_event run doesn’t use the + prefix, so it reads the existing BigQuery data and triggers no upstream rebuild.
  • DONE Check Zendesk tenant Inovonics (DAT-571)
    • Investigated why tenant US-1099511640417 wasn’t mapped to Zendesk org in analytics. Traced through itg_mappings__zendesk_tenant → HubSpot domain mapping → Zendesk organizations table.
    • Root cause: Airbyte Zendesk source connector (source-zendesk-support:0.2.6) has a pagination bug capping ingestion at 10,000 records. The raw organizations table had exactly 10,000 rows; 680 of 911 org IDs in tickets were missing.
    • Recommendation: Upgrade Airbyte Zendesk connector to fix pagination.
  • DONE Add CI to validate Holistics AML project (PR #67 — MERGED)
    • Triggered by a silent semantic conflict: PR #63 (Hieu) added a dashboard depending on domain_hubspot_companies, and PR #64 (Thuan) deleted that model. Git didn’t flag it because they touched different files.
    • Implemented GitHub Actions workflow using Holistics Validation API to validate AML syntax and references on every push.
  • DONE Attended Product Office Hours P1 and P2 (Notion newsletter)
    • Noted: AI theme builder tool at holistics.h-theme-builder.pages.dev/theme-builder — will try with more prospects.
    • New dataset exploration UI looks promising.
    • Data team should contribute to Holistics skills and internal-skills repos.
  • TODO Create agent skill for fixing dbt data pipelines for #data-ops-bot — carry-over from W13, not started.
  • TODO Fix excluding internal testing Zoho accounts (DAT-524) — carry-over since W7, not started.
  • TODO Calendly data pipeline (DAT-283) — carry-over, not started.
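
The silent semantic conflict behind PR #67 can be shown with a toy reference check. This is only a sketch of the idea. The actual workflow calls the Holistics Validation API, while the ref('...') marker, the file path, and the second model name below are hypothetical.

```python
import re

# Hypothetical reference marker; real AML references look different.
REF_PATTERN = re.compile(r"ref\('([A-Za-z0-9_]+)'\)")


def find_dangling_refs(files: dict[str, str], defined_models: set[str]) -> dict[str, list[str]]:
    """Return {path: [missing model names]} for references to undefined models."""
    dangling = {}
    for path, text in files.items():
        missing = sorted(set(REF_PATTERN.findall(text)) - defined_models)
        if missing:
            dangling[path] = missing
    return dangling


# State of the main branch before either PR.
base_models = {"domain_hubspot_companies", "domain_hubspot_deals"}

# PR #63 adds a dashboard that references an existing model.
pr63_files = {"dashboards/companies.aml": "source: ref('domain_hubspot_companies')"}

# PR #64 deletes that model and touches no dashboard file.
pr64_models = base_models - {"domain_hubspot_companies"}

# Each PR is valid on its own.
only_pr63 = find_dangling_refs(pr63_files, base_models)
only_pr64 = find_dangling_refs({}, pr64_models)

# Merged together, the dashboard points at a model that no longer exists.
merged = find_dangling_refs(pr63_files, pr64_models)
print(only_pr63, only_pr64, merged)
```

Git merges both PRs cleanly because the changes live in different files, so only a project-wide validation step on every push can catch the broken reference.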

Docs

  • TODO Polish embedding documentation — increasing embedding leads make this urgent. Same item tracked under Presales.
  • WAITING Add demo video for local development docs — waiting for team to trigger.

Logseq

  • DONE Add project glossary (PR #13) — created canonical project list in pages/Projects.md.
  • DONE Improve backlog structure and automation query pre-processing (PR #12).

Personal / Tooling

  • DONE Changed to a more professional work avatar.
  • DONE Edited the video from the last Hue trip (completed on Saturday).
  • DONE Bank statement (sao kê, a personal finance task).
  • DONE Settled on mochi.cards as the flashcard app for learning English words.
  • Installed Annotate — free on-screen annotation tool for customer demos and screen recordings. Lightweight alternative to paid tools.
  • Discovered Pebble Index 01 — smart ring for quick voice notes. Interesting for capturing ideas on the go.
  • Found Vietnam Real Estate Dataset on HuggingFace — potential hobby data analysis project.
  • Note: Logseq automation is compounding. It helps retain historical context across projects, which improves performance in sync calls and makes it easier to connect the dots. This is the second reason for feeling productive this week.

Learning & Notes

  • Watched How to Present a MIND-BLOWING Software Demo — key takeaways: recap slides (fact → problems → criteria → new findings), speak customer’s language (airline = passengers, SaaS = users), confirm what they care about before demo, use cases > features, have an assist person, send personalized recap to each audience member. Applied learning to Jonas Chorum call prep.
  • Read How to teach technical concepts with cartoons by Julia Evans — guide on visual teaching. Potential application for Kindle Scribe. Follow-up from W13 recommended resource on “implementation challenges.”
  • Read Details aren’t the problem. The problem is too many of the wrong details by Wes Kao — levels of detail in communication. Completed from W13 recommended resources backlog.
  • LEARNING Single-tenant vs Multi-tenant architecture patterns — relevant context for Jonas Chorum (single-tenant: infra duplicated per tenant) vs most other customers (multi-tenant: shared infra, logical isolation). Understanding this distinction is critical for presales scoping.
  • LEARNING Many country names follow a descriptor + land pattern, often built on a people’s name: Iceland, Greenland, England, Switzerland, Finland, Kazakhstan, Uzbekistan (-stan = land).
  • NOTE Good observation from anh Huy: “If I just stay in the call, everyone can’t improve. The only thing that makes sense now is to force the team to prepare very well before the call.” Leadership by stepping back.
  • NOTE Raise the Calendly pipeline task to top priority next week (noted on Thu).

Next Week

  • P1 — Add2Cart: Project retro with anh Dong — review the full bridge elimination + countries guidance work.
  • P1 — Presales: Onboarding call 1 with Jonas Chorum — finalize dynamic data source demo, address cross-database aggregation limitation.
  • P1 — Internal: Calendly data pipeline (DAT-283) — elevated to P1 per Thu note. Standardize source pipeline and build unified dbt models.
  • P2 — Internal: Create agent skill for fixing dbt data pipelines in #data-ops-bot — carry-over from W13.
  • P2 — Internal: Fix excluding internal testing Zoho accounts (DAT-524) — lingering since W7. Timebox 2h or explicitly deprioritize.
  • P2 — Internal: Follow up on DAT-571 (Zendesk Airbyte connector upgrade) — ensure pagination fix is scheduled.
  • P2 — Docs: Polish embedding documentation — ask anh Tai and anh Huy to clarify scope. Write step-by-step embed portal setup guide.
  • P2 — Add2Cart: Get anh Huy’s review on the D4 countries guidance document.
  • P3 — Presales: Read Modeling Patterns docs.
  • P3 — Personal: Self reflection and update CV.

Career & Personal Consulting

Progress Review (Start/Stop/Keep):

  • Start: Creating a reusable “embedded analytics scoping checklist” — you’ve now handled 5+ customers with embedding needs (Basata, Superbexperience, Showbie, Innerspace, Jonas Chorum). Each one follows the same pattern: data source setup → modeling → RLP/JWT → white-labeling. Document this as a repeatable template.
  • Start: Proactively adding CI/safety nets — the AML validation CI (PR #67) caught a real problem (silent semantic conflict). This kind of infrastructure work prevents future firefighting. Look for similar opportunities.
  • Stop: Writing overly detailed technical documents on first pass — the D4 guidance doc got feedback that it was “too detailed unnecessarily.” Lead with what needs to be done, then add implementation detail as appendix. Apply the Wes Kao “levels of detail” framework you just read.
  • Keep: The observe-then-lead pattern for presales calls — Jonas Chorum (Huy leads, you assist) → Basata (you co-lead) → eventually you lead solo. This is the right progression.
  • Keep: Delivering guidance documents instead of doing the work yourself (D4 for Add2Cart, embed portal tutorial for Basata) — this is staff-level behavior. Scope the problem, identify risks, hand off with a clear plan.

Observations:

  • This was a high-output week: bridge elimination completed (Phases 0–5), countries guidance delivered and rewritten, Erin dashboard handed off, Basata call 2 done, data lineage visualized, Zendesk root cause found, AML CI added, PR #845 merged. The shift from “doing tasks” to “closing projects and handing off” is a maturing pattern.
  • The presales embedding pattern is becoming your specialty. 4 of the last 5 customer interactions centered on embedded analytics. The Basata feedback (“creating single dashboard + embed portal flow isn’t seamless”) is product-level insight — escalate to product team.
  • The Zendesk investigation (DAT-571) uncovered a systemic issue: 680 of 911 org IDs missing due to Airbyte pagination bug. This is the kind of data quality root cause analysis that prevents months of downstream confusion. Good instinct to trace the full lineage.
  • The carry-over list is shrinking this week (DE-206 done, data lineage done, Erin done) — but DAT-524 (W7) and agent skill (W13) remain. Calendly is now elevated to P1. Good prioritization awareness.
  • Positive signal: learning → application cycle is fast (demo video → Jonas Chorum prep; Wes Kao article → D4 rewrite feedback). This suggests high ROI on continued presales skill investment.

Embedded Analytics (your emerging specialty):

Presales & Demo Skills (continuing from W13):

  • Book: The Trusted Advisor by David Maister — carry-over from W13. Still highly relevant as you transition from demo-runner to strategic advisor for embedding customers.
  • Article: The 3-2-1 Speaking Trick — concise communication framework. Complements the Wes Kao article you just read on levels of detail.

Data Engineering & Pipeline Observability (your core craft):

  • Tool: Elementary — dbt-native data observability. Carry-over from W13. Directly relevant to the agent skill for #data-ops-bot.
  • Article: dbt Best Practices — official guide updated for 2026. Covers structuring projects, materialization patterns, and CI workflows. Relevant for the agent skill and mentoring.
  • Tool: Dagster — modern orchestration with asset-centric approach. Worth evaluating if you’re rethinking the data lineage visualization problem (alternative to Google Stitch).

Career Growth (sustaining momentum):

  • Book: Staff Engineer by Will Larson — carry-over from W13. The bridge elimination project (scoping → executing → documenting → handing off) is textbook staff-level work. This book helps you articulate that trajectory.
  • Article: The Engineer/Manager Pendulum by Charity Majors — carry-over from W13. Your role blend (IC engineering + presales + mentoring) is exactly the pendulum she describes.