2026-03-19 - lelouvincx's second brain

Notes

To reproduce a data modeling-related ticket like this, we usually need:
- The how-to question.
- The dataset schema (which model, relationships, available dimensions/metrics).
- The expected AQL explore.
When debugging a dbt test warning/failure, we usually go through:
- The error / signal.
- Which model? What are its upstream models (where the root cause can come from) and downstream models (which are affected)?
- Query real data.
- Find the root cause. There may be several root causes, not just one.
- Categorize root causes:
  - Data pipeline side
  - Source side
- Some common questions:
  - What behavior is the test checking? What does the test expect and not expect?
  - If the test doesn't seem to make sense, ask why it was added in the first place.
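The upstream/downstream question above is a graph walk. Here is a minimal Python sketch, assuming a hand-written dependency map; the model names and edges in `PARENTS` are hypothetical and do not describe the real project DAG.

```python
from collections import deque

# Hypothetical DAG: each model maps to the models it depends on (its parents).
PARENTS = {
    "stg_orders": [],
    "stg_customers": [],
    "dom_orders": ["stg_orders", "stg_customers"],
    "mart_sales": ["dom_orders"],
    "mart_retention": ["dom_orders"],
}


def _walk(start_nodes, edges):
    """Breadth-first walk that returns every node reachable from start_nodes."""
    seen, queue = set(), deque(start_nodes)
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(edges.get(node, []))
    return seen


def upstream(model, parents=PARENTS):
    """All ancestors of `model`: where a root cause can come from."""
    return _walk(parents.get(model, []), parents)


def downstream(model, parents=PARENTS):
    """All descendants of `model`: which models are affected."""
    # Invert the parent map into a child map, then walk it.
    children = {}
    for child, deps in parents.items():
        for dep in deps:
            children.setdefault(dep, []).append(child)
    return _walk(children.get(model, []), children)
```

In a real dbt project the same lists come from the graph operators, for example `dbt ls --select +model_name` for upstream and `dbt ls --select model_name+` for downstream.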
In [[Oxygen Not Included]], there is a DLC that introduces a new kind of labor: Bionic Dupes.
- It’s like adding robots to the labor force, alongside native dupes.
- https://media.secondbrain.lelouvincx.com/2026/03/6254fbf630764a4fe494a585a5b99d4d.png
- The robots come with some advantages:
  - Consume less oxygen
  - Survive better in harsh environments
- But disadvantages:
  - Consume a lot of power
  - Produce gunk / waste water
- Given those disadvantages, it’s hard to justify using the robots in the early game, when power is scarce.
- But in mid game, the robots excel at far-from-base missions. For example: oil biome, space.
- It’s like AI arriving in this era, the era I’m living in.
- When robots play to their strengths, humans should too.
- My thoughts from ((69bb6817-761c-4303-afdb-2bc2af90b150))
Meaning of 1-on-1 meetings with my manager
- A chance to nurture the people of the team/organization.
Recently I realized that the people who treat their work at the company as their own are the successful ones.
- In other words, “làm việc bằng cái tâm” (working with your whole heart).
- Because when you treat it as your own work, “not my business” does not exist.
- You take more responsibility, show more empathy, and deliver better quality.
- Quality in everything.
Tasks
Partly today
- DONE Ingest full Calendly data into BigQuery
Context
- This project is to ingest full Calendly data into BigQuery.
- Currently only demo and onboarding calls are ingested, not 1-1, case study, or customer success calls.
- This project has been suspended for a long time.
- I am the PIC, but I have forgotten almost everything about it.
- Better to treat this project as if I’m starting over.
Trigger
- Growth team needs.
  - Q: What use cases does the Growth team need call data for?
    - Sales rep performance.
    - Categorize leads (call-first or trial-first).
- Although not high priority.
- Since I’m coming back from higher priority projects (AI benchmarking and Add2Cart), it’s time to do this.
At Holistics, we actually have another data source for call recordings, which is Clari/ReadAI.
- Thuan is actively working on that project.
- Q: Am I overlapping with his work?
- Ampcode
- Let’s first recall the purpose / role of Calendly data within the big picture of our data platform.
  - Q: What is Calendly used for?
    - A:
      - Counting the number of calls per sales rep => Vincent wants to know how many calls each rep takes per week.
      - https://media.secondbrain.lelouvincx.com/2026/03/66b02fb3a32b691cee0255245b5d988f.png
      - Ingredient for calculating sales motion (call-first / trial-first).
  - Q: Who uses it?
    - A: Growth team (Vincent). He needs visibility into rep activity (calls made, follow-ups done).
  - Q: How important is it?
    - A: Medium-low.
    - There’s a known limitation that compounds the challenge: Calendly does not auto-mark meetings as no-shows, and Hubspot’s “Last Meeting Interaction” is unreliable because it requires reps to manually mark meeting outcomes as “Completed.”
  - Q: Does it overlap with the ReadAI project?
    - A: For scheduled calls, Calendly is more trusted than ReadAI because some calls are not recorded or are private, so ReadAI cannot ingest them.
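The first use case above, calls per rep per week, could be sketched like this in Python. The `bookings` rows and email addresses are made up for illustration; real data would come from the Calendly tables in BigQuery, and the ISO-week bucketing is an assumption about how "a week" is defined.

```python
from collections import Counter
from datetime import date

# Hypothetical booking rows: (rep_email, call_date).
bookings = [
    ("rep_a@example.com", date(2026, 3, 16)),
    ("rep_a@example.com", date(2026, 3, 18)),
    ("rep_b@example.com", date(2026, 3, 17)),
    ("rep_a@example.com", date(2026, 3, 24)),
]


def calls_per_rep_per_week(rows):
    """Count calls keyed by (rep, ISO year, ISO week)."""
    counts = Counter()
    for rep, day in rows:
        iso_year, iso_week, _weekday = day.isocalendar()
        counts[(rep, iso_year, iso_week)] += 1
    return dict(counts)


weekly_calls = calls_per_rep_per_week(bookings)
```

Keying on ISO year plus ISO week avoids mixing week 1 of one year with week 1 of another.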

Plan

Define ingestion schema, what we need.
The current pipeline is Calendly => Zapier => Google Sheets => Data Import => BigQuery
- This only captures created events; cancelled/rescheduled events need another Zap.
- It holds only 400 events, while we have 4438 events in total (since 2016).
- Hard to control/debug when a sync error occurs.
Suggest a better tool to ingest Calendly: either Prefect or Airbyte.
Here is the whole schema taken from openapi: https://dbdiagram.io/e/69d61041808962968445981c/69d62f0f8089629684479d06
Required data from these tables: event_types, events, cancellations, invitees, event_guests, users, invitee_questions_and_answers
There’s event_types data that cannot be extracted through Zapier.
For now, to avoid being blocked, I’m using Fivetran to sync the data. In the future we can upgrade Airbyte and switch; nothing should break since we already know the schema from the official Calendly API.
Waiting for anh Dong to review the document and code. There are some files I don’t like in the code but it’s fine for now.
OpenAPI spec: https://stoplight.io/api/v1/projects/calendly/api-docs/nodes/reference/calendly-api/openapi.yaml
In the ingestion phase of a data pipeline, look for something like this OpenAPI spec to use as the source schema reference. This matters because when the source API changes, we have a reference for updating our data code accordingly.
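One way to use such a spec as a reference is to diff its field names against the columns the pipeline loads. This is only a sketch: the `spec_properties` fragment and the `ingested_columns` list are hypothetical and not copied from the Calendly spec or from our tables.

```python
# Hypothetical fragment of an OpenAPI schema object: property name -> definition.
spec_properties = {
    "uri": {"type": "string"},
    "name": {"type": "string"},
    "status": {"type": "string"},
    "start_time": {"type": "string", "format": "date-time"},
    "meeting_notes": {"type": "string"},  # pretend the source added this recently
}

# Hypothetical columns our pipeline currently loads.
ingested_columns = ["uri", "name", "status", "start_time", "legacy_flag"]


def schema_drift(spec_properties, ingested_columns):
    """Compare spec field names with loaded columns and report both directions."""
    spec_fields = set(spec_properties)
    columns = set(ingested_columns)
    return {
        "new_in_source": sorted(spec_fields - columns),
        "gone_from_source": sorted(columns - spec_fields),
    }


drift = schema_drift(spec_properties, ingested_columns)
```

Running a check like this whenever the spec file changes gives an early signal before a sync starts failing.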
Anh Dong has reviewed; now it’s my job to respond to the review comments.
Re-model into fact bookings and dim events, where the fact bookings grain = event x invitee x rep.
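To make the grain concrete, here is a small Python sketch that expands nested events into one row per event x invitee x rep. The event structure and all values are hypothetical; the real shaping happens in the dbt models.

```python
from itertools import product

# Hypothetical nested events, each with its invitees and hosting reps.
events = [
    {
        "event_id": "evt_1",
        "invitees": ["alice@example.com", "bob@example.com"],
        "reps": ["rep_a@example.com"],
    },
    {
        "event_id": "evt_2",
        "invitees": ["carol@example.com"],
        "reps": ["rep_a@example.com", "rep_b@example.com"],
    },
]


def to_booking_rows(events):
    """Emit one row per (event, invitee, rep) combination."""
    rows = []
    for event in events:
        for invitee, rep in product(event["invitees"], event["reps"]):
            rows.append(
                {"event_id": event["event_id"], "invitee": invitee, "rep": rep}
            )
    return rows


booking_rows = to_booking_rows(events)
distinct_events = {row["event_id"] for row in booking_rows}
```

At this grain, two events become four rows, so call counts should use distinct `event_id` values rather than row counts.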
PR approved (https://github.com/holistics/dbt/pull/853); next step is to run the dbt models and continue in Holistics BI.
Update reporting layer:
- Model
- Dataset
- Dashboard

| Layer | Model | Alias / role |
| --- | --- | --- |
| Source | `src_calendly` (7 Fivetran tables) + `src_gsheets.calendly__event_type_categories` (mapping seed) | |
| Staging | `stg_calendly__events`, `stg_calendly__event_invitees`, `stg_calendly__event_hosts`, `stg_calendly__event_guests` | |
| Domain | `dom_calendly__bookings` (grain: event × attendee × rep) | |
| Mart | `mart_growth__calendly_bookings` (view) | `fct_calendly_bookings`, fact table |
| Mart | `mart_growth__calendly_events` (table) | `dim_calendly_events`, dimension |
| Deprecated | `mart_growth__demo_calls`, `mart_growth__onboarding_calls`, `mart_core_business__demo_call_bookings`, `dom_calendly__demo_call_bookings`, `dom_calendly__onboarding_call_bookings` (all `enabled: false`) | |
| Fixed | `mart_core_business__sales_leads` (now joins `dom_calendly__bookings` directly on `attendee_email_domain`) | |

((69a2a1cc-fc1f-4802-ba56-48183fa9e9a5))
TODO Think about applying AI into resolving #data-ops-bot issues (data pipelines)
What we do:
- Pick a task
- Resolve it manually
- Once done, write a very detailed guide (so that someone with minimal knowledge of Holistics can still resolve it themselves)
- Give it to agents to try
- Loop
After that, we can materialize it as a skill.
Plug it into the agent and tag it on Slack whenever a bug appears.
Improve it incrementally.
Issues in ((69b3d030-98dd-450d-88d8-e00c1c322c59)) can be resolved easily and quickly.
- Better approach: delegate to juniors (Thuan / agents).
  - For agents:
    - Write a very detailed step-by-step guidance for the task.
    - Let the agent attempt.
    - Note down learning points => improve => release the first version of agent skill.
  - For Thuan (I will need advice here; this is just a simple process I came up with):
    - Pair programming.
    - Observe how he solves the problem.
    - Note down teaching points, learning points.
Done
- DONE Support Sasha mybacs
DONE Support Sasha mybacs
- relative_period does not support future time?
- Since AQL doesn’t support a dynamic rolling window where N changes per row, we split this into 4 separate conditional demand metrics using case + where, then combine them. Each metric contributes demand only for rows matching its product group AND whose month falls within the correct future window.
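The split-and-combine logic can be illustrated outside AQL with a small Python sketch. The group names, window lengths in `WINDOWS`, and sample rows are hypothetical; only the pattern (one conditional metric per product group, then a sum) follows the note above.

```python
# Hypothetical future windows (in months) per product group.
WINDOWS = {"group_a": 1, "group_b": 2, "group_c": 3, "group_d": 6}

# Hypothetical rows: months_ahead is how far the row's month lies in the future.
rows = [
    {"product_group": "group_a", "months_ahead": 1, "demand": 10},
    {"product_group": "group_a", "months_ahead": 2, "demand": 99},  # outside window
    {"product_group": "group_b", "months_ahead": 2, "demand": 20},
    {"product_group": "group_c", "months_ahead": 4, "demand": 99},  # outside window
    {"product_group": "group_d", "months_ahead": 5, "demand": 30},
]


def conditional_demand(rows, group, window):
    """One metric: demand for a single group, restricted to its own window."""
    return sum(
        row["demand"]
        for row in rows
        if row["product_group"] == group and 1 <= row["months_ahead"] <= window
    )


def total_demand(rows, windows=WINDOWS):
    """Combine the per-group conditional metrics into one total."""
    return sum(conditional_demand(rows, g, w) for g, w in windows.items())


combined = total_demand(rows)
```

Because each row belongs to exactly one product group, the four metrics never overlap, so adding them does not double count.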
DONE Add shadcn/ui skills to speed up frontend development

Chinh (lelouvincx) / 2026-03-19
