  • User Context & Persona

    • Professional Profile

      • The user is a Data Analytics Engineer and Educator. They maintain a technical site at lelouvincx.com.
      • They are highly technical, preferring detailed, engineering-focused discussions over high-level overviews.
      • They work at Holistics (holistics.io).
    • Technical Stack & Expertise

      • Data Engineering: Expert in SQL optimization, data warehousing (Kimball/Inmon), and ETL/ELT pipelines. Frequently uses dbt, Dagster, DuckDB, PostgreSQL (RDS), and Redshift.
      • BI & Visualization: Experienced with Holistics and Metabase.
      • Infrastructure: Proponent of self-hosting and cost optimization. Uses low-cost VPS providers (Contabo), Docker, and Cloudflare.
      • Workflow: Uses macOS with a keyboard-driven workflow. Preferred editor is Neovim (LazyVim config); uses Logseq for knowledge management.
    • Active Projects (Current Context)

      • 1. [Duty] Holistics Support

        • Role: Customer Duty Support / Data Analytics Engineer at Holistics.
        • Key Context: Focuses on troubleshooting and scaling solutions for clients.
        • Recent Work:
          • Client “Pencil”: Scaling customer dashboards and optimizing data models within the Holistics platform.
          • Workflow: Synthesizing support call transcripts and drafting technical follow-up communications.
      • 2. [Add2Cart] Data Consultancy Service

        • Tech Stack: RDS (PostgreSQL), S3 (data lake), Redshift (data warehouse), Holistics (BI tool).
        • Current Status: Configuring replication tasks and debugging connection/permission issues between the OLTP source and the OLAP destination.
      • 3. [Teaching] Data Analytics Educator

        • Role: Instructor for a junior analyst and a junior engineer.
        • Curriculum: Developing a gamified analytics course based on the Olist E-commerce dataset.
        • Resources: Creating a “Teacher’s Guide” and phased learning materials.
        • Infrastructure: Manages a self-hosted Metabase instance for students, powered by DuckDB and Neon.tech and exposed via Cloudflare Tunnels. Publishes learning materials on a public website at https://ngocyen99tb.lelouvincx.com, built with Docusaurus.
      • 4. [Personal] Infrastructure & Tools

        • Blog/Docs: Revamping lelouvincx.com (formerly Hugo with the Blowfish theme), with a focus on themes, search plugins, and sitemaps.
        • Self-Hosting: Personal infrastructure includes a Contabo VPS.
    • Cultural Context

      • The user is interested in Vietnamese language, culture, and technology (e.g., local banking, esports).
    page Created Mon, 25 May 2026 00:00:00 +0000
  • page Created Mon, 25 May 2026 00:00:00 +0000
    • Category: [[Database Optimization]]
    • When filtering data, instead of loading everything into memory and then filtering, predicate pushdown applies the filter at the query execution layer or storage layer, so that only the matching data is loaded into memory.
    • Benefit:
      • Reduce I/O
      • Reduce query time
    • How it works
      • Before (without predicate pushdown)
        1. Read: The engine reads 100% of the data from storage.
        2. Transfer: All of that data is transferred over the network to the compute engine.
        3. Filter: The compute engine loads the data into memory and applies the WHERE clause.
      • After (with predicate pushdown)
        1. Read: The compute engine sends the query and its filter to the storage/metadata layer.
        2. Filter: The storage/metadata layer uses block-level statistics (e.g., min/max values) to skip entire blocks that cannot match the WHERE clause.
        3. Transfer: Only the remaining, smaller data set is transferred over the network.
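      • Sketch: the before/after steps above can be illustrated with a toy model in plain Python. This is a simplified illustration, not how any particular engine is implemented; `RowGroup`, `scan_without_pushdown`, and `scan_with_pushdown` are hypothetical names, and real storage formats keep the min/max statistics precomputed in metadata rather than deriving them on the fly.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """A block of rows; real formats store min/max in metadata (assumed here)."""
    rows: list

    @property
    def max_value(self):
        # Stand-in for a precomputed statistic read from the block's metadata.
        return max(self.rows)


def scan_without_pushdown(groups, threshold):
    """Read every block, then filter in the compute engine."""
    rows_read, result = 0, []
    for group in groups:
        rows_read += len(group.rows)  # whole block is read and transferred
        result.extend(v for v in group.rows if v > threshold)
    return result, rows_read


def scan_with_pushdown(groups, threshold):
    """Use block statistics to skip blocks that cannot satisfy `value > threshold`."""
    rows_read, result = 0, []
    for group in groups:
        if group.max_value <= threshold:
            continue  # statistics prove no row in this block can match
        rows_read += len(group.rows)
        result.extend(v for v in group.rows if v > threshold)
    return result, rows_read


# Four sorted blocks of 100 values each: 0-99, 100-199, 200-299, 300-399.
groups = [RowGroup(list(range(start, start + 100))) for start in (0, 100, 200, 300)]
plain_result, plain_read = scan_without_pushdown(groups, threshold=250)
pushed_result, pushed_read = scan_with_pushdown(groups, threshold=250)
print(plain_read, pushed_read, plain_result == pushed_result)  # 400 200 True
```

      • Both scans return the same rows; the pushdown version simply reads half as many because two blocks are ruled out by their statistics alone.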
    • Pitfalls
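      • Casting: a predicate such as WHERE CAST(string_col AS INTEGER) > 5 typically cannot be pushed down.
        • Because the compute engine has to read all the data and cast it before it can filter (a pattern that Holistics usually uses).
      • Complex UDFs: the storage layer cannot evaluate them, so the filter runs in the compute engine.
      • Unsorted data: if matching values are scattered randomly across row groups or partitions, most blocks' statistics overlap the filter and predicate pushdown skips very little.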
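      • Sketch: a toy illustration of the unsorted and casting pitfalls, assuming block-level max statistics as the skipping mechanism. `blocks_to_read` is a hypothetical helper, and the casting part only shows why statistics on the raw strings are not safe to reuse for a numeric comparison; actual engine behavior varies.

```python
import random


def blocks_to_read(values, block_size, threshold):
    """Count blocks whose max statistic cannot rule out `value > threshold`."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return sum(1 for block in blocks if max(block) > threshold)


# Unsorted pitfall: the same 1,000 values, sorted versus shuffled.
values = list(range(1000))
sorted_reads = blocks_to_read(values, block_size=100, threshold=899)

shuffled = values[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable
shuffled_reads = blocks_to_read(shuffled, block_size=100, threshold=899)

# Casting pitfall: statistics on a string column are ordered lexicographically.
string_block = ["100", "20", "3"]
string_max = max(string_block)                     # "3", not "100"
would_skip = int(string_max) <= 5                  # naive reuse of the string statistic
has_match = any(int(v) > 5 for v in string_block)  # "100" and "20" both match

print(sorted_reads, shuffled_reads > sorted_reads, would_skip and has_match)
```

      • With sorted data a single block needs reading; once shuffled, matching values land in many blocks. In the casting case, trusting the string max would skip a block that actually contains matches, which is why an engine has to fall back to reading and casting everything.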
    • Principles

      • A triage page for heuristics gained from experience. Not the final home: this is the inbox / staging area.
      • Each entry: one-line trigger, one-line response, backlink to the journal where it was forged, and a status.
      • Once a heuristic matures (fires a 2nd time and has a clear domain), route it out to a dedicated page or upstream to a team-owned doc. The triage page stays small on purpose.
    • How to use this page

      • Capture freely. Anything I notice once gets dropped here. Low friction.
      • Promote on second use. If a heuristic fires a 2nd time in real work, it has earned a spot on a dedicated page.
      • Archive on neglect. If an entry sits in triage > 60 days without a 2nd use, archive it. The page is for active heuristics, not aspirations.
      • Re-read during Friday wrap-up. Demote anything routed-out that hasn’t fired in 90 days.
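      • Sketch: the capture / promote / archive / demote rules above, written as a small Python function. The `Heuristic` fields and the `friday_review` name are hypothetical, the sample entries are made up, and treating "hasn't fired in 90 days" as 90 or more idle days is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Heuristic:
    """One entry on the triage page (field names are hypothetical)."""
    name: str
    captured: date                     # day the heuristic was first noted
    status: str = "triage"             # "triage" or "routed"
    times_fired: int = 1               # occurrences in real work, including the first
    last_fired: Optional[date] = None  # most recent occurrence, if tracked


def friday_review(entry: Heuristic, today: date) -> str:
    """Return the action the page's rules suggest for one entry."""
    if entry.status == "triage":
        if entry.times_fired >= 2:
            return "promote"           # second use earns a dedicated page
        if (today - entry.captured).days > 60:
            return "archive"           # sat in triage too long without a 2nd use
        return "keep"
    if entry.status == "routed":
        last = entry.last_fired or entry.captured
        if (today - last).days >= 90:
            return "demote"            # routed out but has not fired recently
    return "keep"


today = date(2026, 5, 29)  # sample Friday
fresh = Heuristic("entry A", captured=date(2026, 4, 24))
stale = Heuristic("entry B", captured=date(2026, 3, 1))
proven = Heuristic("entry C", captured=date(2026, 4, 27), times_fired=2)
print([friday_review(e, today) for e in (fresh, stale, proven)])
# ['keep', 'archive', 'promote']
```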
    • Routing targets

      • prioritization / scheduling → [[Priority Rules]]
      • sleep / illness / movement / energy → [[Health Rules]]
      • (future) work-craft pages may emerge:
        • communication → [[Communication Rules]] (TBD)
        • code review / on-call → [[Code Review Rules]] (TBD)
      • Some heuristics belong outside the second-brain. Upstream to team-owned docs when the rule is not unique to me: doing so converts a private heuristic into team infrastructure, and the writing exercise itself is career-visible.
    • Status legend

      • triage: just captured, awaiting a 2nd occurrence.
      • routed → [[Page]]: promoted to a dedicated page; this entry kept as a stub for backlink.
      • upstream → <team doc>: proposed / merged into a team-owned doc.
      • archived: did not survive a 2nd use within 60 days, or superseded.
    • Entries

      • Work craft

        • dbt deprecation 3-phase rule
          • trigger: deprecating a dbt model that other AML / BI assets reference.
          • response: phase 1 deprecate in code → phase 2 remove the .sql → phase 3 drop the BQ table. Don’t sit between phase 2 and 3: AML references BQ physical names, so removing the .sql only stops refresh and leaves the table silently misleading consumers.
          • forged: [[Weekly Report - 2026-W17]] · Ampcode T-019da543
          • validated: W18 Calendly cleanup, where 3 legacy models were deleted in PR internal-aml-project#78 without re-deriving the reasoning.
          • status: upstream → holistics/dbt contributing guide (proposed). Team correctness rule, should outlive me.
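          • Sketch: one way to detect the "stuck between phase 2 and 3" state is a set comparison. This is a minimal illustration, not part of dbt or AML: `stranded_tables` is a hypothetical helper, the three inputs are assumed to be collected by hand (model files in the repo, tables in the warehouse, physical names referenced downstream), and it assumes each model materializes a table with the same name as its .sql file.

```python
def stranded_tables(model_files, warehouse_tables, referenced_tables):
    """Tables stuck between phase 2 and phase 3.

    A table is stranded when its .sql model is gone (so it no longer refreshes)
    but the physical table still exists and a downstream asset still points at it.
    """
    live_models = {name.removesuffix(".sql") for name in model_files}
    return sorted((set(warehouse_tables) - live_models) & set(referenced_tables))


# Hypothetical snapshot of a project mid-cleanup (all names are made up).
model_files = {"fct_mrr.sql", "stg_orders.sql"}
warehouse_tables = {"fct_mrr", "stg_orders", "legacy_signups"}
referenced_tables = {"fct_mrr", "legacy_signups"}

print(stranded_tables(model_files, warehouse_tables, referenced_tables))
# ['legacy_signups']
```

          • Any name in the output is a table that consumers can still read but that nothing refreshes, which is the state the rule says not to linger in.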
        • Small P1 ships before big P2 context-switch
          • trigger: a P1 escalation lands while a small / cheap P1 task is still open.
          • response: ship the 30-min item first, then context-switch. The cheapest item is the one most at risk of slipping when escalations arrive.
          • forged: [[Weekly Report - 2026-W18]] (dbt PR #858 round 1 review slipped when BuyCo + Wamly P1 escalations landed Wed).
          • status: triage. Candidate for a future on-call / escalation handling note.
      • Communication

        • Customer comms reframe: recipe, not IOU
          • trigger: a customer asks for help with a workflow / configuration.
          • response: frame as “here is the recipe, happy to pair if helpful” instead of “I’ll do it for you”. Volunteering hands-on work tends to get silence in return.
          • forged: [[2026-04-27]] · validated against W17 Showbie + Basata silence pattern.
          • validated: W18 BuyCo 14-thread triage, where replies were structured as workarounds + screen-recording asks, not “I’ll log in and fix it for you”.
          • status: upstream → presales comms playbook / CS onboarding doc (proposed). Pattern is not unique to my threads.
        • Ask for no, don’t ask for yes (Mooreds)
          • trigger: needing a small decision from a busy stakeholder (manager, peer).
          • response: replace “can we do X?” with “I’m going to do X to solve Y, will take care of it Monday unless I hear differently.” Shifts cognitive load to opt-out and unblocks progress while keeping ownership.
          • forged: [[Weekly Report - 2026-W17]] · mooreds.com
          • status: triage. Tone needs to be calibrated per person; practice on smaller decisions first.
        • Concision: what is the most important thing to say? (Wes Kao)
          • trigger: writing a status update, PR description, Slack message, or stakeholder note.
          • response: lead with the most important sentence. Cut anything that doesn’t change the reader’s decision or action.
          • forged: [[Weekly Report - 2026-W18]] · Wes Kao, “How to Be Concise”
          • status: triage. Pair with Levels of detail in communication once read.
      • Decision-making / scheduling

        • Vague task → time-block + decompose to smallest concrete next step
          • trigger: a task feels blocking / overwhelming and keeps getting deferred.
          • response: time-block a slot, then force decomposition into the smallest concrete next step. Long carry-overs often have a procrastination root, not a priority root.
          • forged: [[2026-04-30]] · Wait But Why, “How to Beat Procrastination”
          • validated: same-day BuyCo 14-item Notion page → 14 micro Slack threads in #holistics-buyco-external, all shipped in one Wed afternoon.
          • status: triage. Candidate for promotion to [[Priority Rules]] (pairs with the 3+ week carry-over forcing function) on next occurrence.
        • Highest-leverage triage on low-energy weeks → routed → [[Priority Rules]]
        • 3+ week carry-over forcing function → routed → [[Priority Rules]]
      • Reflection / craft hygiene

        • Writing is thinking: daily journals are 100% me
          • trigger: temptation to outsource a journal entry / reflection draft to an LLM.
          • response: don’t. LLM is fair game for boilerplate / first drafts only. The reflection muscle is the differentiator.
          • forged: [[Weekly Report - 2026-W18]] · HN thread
          • status: triage.
        • Cultivate non-LLM reading sources
          • trigger: defaulting to LLM search inside Logseq for ideas / craft inspiration.
          • response: deliberately seek out blogs / RSS / newsletters from individual senior writers (e.g., mooreds.com, Wes Kao, benn.substack). Signal/noise from a curated human writer is different from an aggregated answer.
          • forged: [[Weekly Report - 2026-W17]]
          • status: triage.
      • Sleep > productivity → routed → [[Health Rules]]
      • Illness = hard stop on day 2 → routed → [[Health Rules]]
      • No gaming after 18:00 on weeknights → routed → [[Health Rules]]
      • Eat before exercise → routed → [[Health Rules]]
      • Weekend discipline (1 full off-day default) → routed → [[Health Rules]]
    • Notes

      • Re-read this page during Friday wrap-up alongside [[Priority Rules]] and [[Health Rules]].
      • When the same heuristic shows up in 2 separate weekly reports, that’s the promotion signal: move it out.
      • Upstreaming to team docs is higher leverage than keeping a heuristic personal. Ask: “would Thuan / a new hire benefit from this being in the team’s docs?” If yes, draft the upstream PR.
    • Priority Rules

      • A living set of personal rules for prioritizing tasks. Improve incrementally.
    • Working principles

      • Choose the highest-leverage tasks even when the week’s workload is low (e.g., sick, leave, traveling). A low-energy week shouldn’t slide into a low-leverage week: pick fewer items, but pick the most important ones.
      • 3+ week carry-over rule. Any item carried over for 3 or more weeks gets a forcing function:
        • either elevate its priority and schedule it on a specific day, OR
        • formally move it to backlog / waiting (with the reason).
        • Goal: avoid the “always-on but never-done” zone.
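      • Sketch: the carry-over rule as a small Python check that refuses to let a long-running item roll over silently. `Item`, `needs_forcing_function`, and `resolve` are hypothetical names, and the sample items are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    title: str
    weeks_carried: int  # consecutive weeks the item has been carried over


def needs_forcing_function(item: Item, limit: int = 3) -> bool:
    """True once an item has been carried over `limit` or more weeks."""
    return item.weeks_carried >= limit


def resolve(item: Item, scheduled_day: Optional[str] = None,
            backlog_reason: Optional[str] = None) -> str:
    """Apply the rule: schedule it on a specific day, or park it with a reason."""
    if not needs_forcing_function(item):
        return "carry over"
    if scheduled_day:
        return f"elevate and schedule on {scheduled_day}"
    if backlog_reason:
        return f"move to backlog/waiting: {backlog_reason}"
    # Neither option chosen: this is the always-on-but-never-done zone.
    raise ValueError(f"'{item.title}' needs a scheduled day or a backlog reason")


print(resolve(Item("sample task A", weeks_carried=1)))
# carry over
print(resolve(Item("sample task B", weeks_carried=4), scheduled_day="Tuesday"))
# elevate and schedule on Tuesday
```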
    • Priority tiers

      • P0: Urgent
        • Must do immediately.
        • Usually customer-facing tasks that need a response or mitigation ASAP (e.g., production incident, enterprise customer escalation, data outage).
        • Drop other work to handle these.
      • P1: High
        • Should be prioritized because a stakeholder is blocked on it.
        • Includes manager-elevated items and teammate-blocking reviews.
      • P2: Normal
        • Important but not urgent.
        • Picked up when no P1 tasks are active.
        • Most internal data work / improvements live here.
      • P3: Personal / Teaching / Free time
        • Personal projects, teaching, learning, exploration.
        • Can be done in free time, or delegated to others when possible.
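      • Sketch: the tiers above expressed as an ordered enum, so that "P2 is picked up only when no P1 is active" falls out of a simple minimum. `Tier` and `pick_next` are hypothetical names and the task list is made up.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    P0 = 0  # urgent: drop other work
    P1 = 1  # high: a stakeholder is blocked
    P2 = 2  # normal: important but not urgent
    P3 = 3  # personal / teaching / free time


def pick_next(tasks: list) -> Optional[str]:
    """Return the title of the most urgent task; ties keep list order."""
    if not tasks:
        return None
    title, _tier = min(tasks, key=lambda task: task[1])
    return title


tasks = [
    ("refactor staging models", Tier.P2),
    ("review teammate PR", Tier.P1),
    ("prepare lesson", Tier.P3),
]
print(pick_next(tasks))  # review teammate PR
```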
    • Notes

      • Re-read this page during weekly planning (Friday wrap-up).
      • When a rule causes friction in practice, capture the friction here as a new bullet, then iterate.
  • Resources

      1. The Foundational “Bible” of PLG
      2. Technical Modeling
      • dbt Labs Blog - Modeling for PLG: Search for their articles on “Modeling the Customer Journey.” They have excellent deep-dives on how to handle the many-to-many relationship between Users and Accounts in SQL.
      • The Hightouch or Census Blogs (Data Activation): These companies pioneered Reverse ETL. Their blogs are specifically geared toward engineers building the “HubSpot + Product Data” loop. Look for “How to build a PQL scoring model in Snowflake/BigQuery”.
      • Hacking Data (Newsletter/Blog): Often covers the technical debt involved in stitching together SaaS tools like HubSpot and Stripe (similar to your Zoho setup).
      3. Industry Voices
      • Benn Stancil’s Substack: Benn (co-founder of Mode) writes deeply about the intersection of BI tools and data modeling. Since you are building a BI-related product, his insights on how users “consume” data are highly relevant.
      • Locally Optimistic Slack: A community for data leaders and engineers. There is often a lot of discussion about “GTM (Go-To-Market) Data Modeling.”
  • List of projects

    • [[Add2Cart]]: Cross-functional project for the Add2Cart professional service: data pipeline, performance optimization, dynamic markdown, and S3 setup
    • [[AI]]: Holistics AI squad work: test suites, AI feature trials, squad coordination with Dat/Tram
    • [[Docs]]: Holistics documentation improvements and revamping
    • [[Duty Support]]: Rotating on-call data support for Holistics customers (Zendesk tickets, impersonation, troubleshooting)
    • [[Freetime]]: Low-priority reading and exploration done in spare time
    • [[Internal]]: Holistics internal data team work: dbt, MRR reporting, CI, tracking
    • [[Logseq]]: Improvements to this second-brain system: automation, backlog structure, weekly reports
    • [[Misc]]: One-off tasks that don’t fit other projects
    • [[Personal]]: Personal tooling, side projects, migration, self-hosting, learning
    • [[Personal Productivity Analysis]]: Personal productivity monitoring. Ingest weekly reports + journals into DuckDB (consistent schema with [[Personal Finance]] for later consolidation) and build a dashboard to track throughput, carry-overs, project mix, priority hit-rate, presales W/L, and tooling ROI, to drive data-driven planning and 1-on-1 decisions
    • [[Presales]]: Holistics presales support: customer demos, solution proposals, competitive analysis
    • [[Read]]: Reading and studying: articles, books, courses, YouTube
    • [[Smartclass]]: Personal side project: an ed-tech app with frontend (shadcn/ui) and releases
    • [[Teaching]]: Mentoring and teaching data/SQL: composing projects (Olist), visualization, set-based thinking