If you’ve tried to bolt “AI search” onto a real company’s Google Workspace, you’ve probably felt the gap:
- Semantic search can find relevant documents, but it struggles with “what’s the latest?”
- It can summarize a doc, but it struggles with “who owns this?”
- It can surface a paragraph, but it struggles with “what changed since last month, and which email thread approved it?”
That’s not because embeddings are bad. It’s because most operational questions are relationship questions.
This post is a practical playbook for building a business knowledge graph from Google Drive + Gmail—incrementally, permission-aware, and without migrating your whole company into a new “knowledge base.” Internally at nNode we call this a “Karpathy graph”: a lightweight, evolving context layer that helps an AI operator behave less like a chatbot and more like a teammate.
We’ll cover:
- Why vector RAG alone breaks on operator queries
- A minimum viable work-graph: the nodes/edges that matter first
- A build pipeline: Drive/Gmail → extraction → graph → retrieval
- GraphRAG vs RAG: a decision framework
- Permission-aware retrieval and approval-first execution
- Two starter workflows you can ship fast
1) The problem with “AI search” in real companies
In clean demos, RAG looks like magic: chunk docs, embed, retrieve, summarize.
In actual operations, the failure modes are predictable:
- Version chaos: `final.pdf`, `final_v3.pdf`, `final_final_revised.pdf` in three folders.
- Ambiguous names: “Pricing Sheet” means something different per region or customer segment.
- Email-only truth: the real terms are in a Gmail thread, not the doc.
- Ownership drift: the doc exists, but nobody knows who can approve a change.
- Permission realities: “just index everything” becomes “accidentally leak everything.”
The core issue: most operator questions require recency + relationships + permissions—not just similarity.
2) What a “Karpathy graph” means (in plain English)
This is not “let’s do a 6-month ontology project.”
A Karpathy graph (operator definition) is:
A private, evolving map of your business: people, customers, vendors, projects, documents, email threads, decisions, and tasks—plus the links between them.
It’s “lightweight” because:
- You start with metadata + pointers (not perfect document understanding).
- You accept uncertainty (confidence scores, “maybe” edges).
- You build it incrementally, driven by the workflows you actually run.
In nNode terms, this graph is the assistant’s context layer—the thing that makes “blackbox mode” (works on messy Drive) possible without forcing a migration.
3) Minimum viable schema: 12 nodes that unlock ops value
You don’t need hundreds of entity types. You need just enough structure to answer:
- “What’s the latest?”
- “Who owns it?”
- “What’s connected to this customer/vendor/project?”
- “Where did this decision come from?”
Here’s a minimum viable schema that works surprisingly well.
Core entity nodes
| Node | What it represents | Example properties |
|---|---|---|
Person | Employee / collaborator | email, name, role |
Org | Company entity | domain, legal name |
Customer | Customer account | domain, external IDs |
Vendor | Supplier/provider | domain, category |
Project | Ongoing initiative | status, owner |
Product | SKU / service | SKU, category |
Artifact nodes
| Node | What it represents | Example properties |
|---|---|---|
DriveFile | Doc/Sheet/PDF/etc. | fileId, mimeType, modifiedTime, webViewLink |
EmailThread | A Gmail thread | threadId, subject, lastMessageAt, participants |
Meeting | A calendar event + transcript | eventId, startTime, attendees |
Commitment & execution nodes
| Node | What it represents | Example properties |
|---|---|---|
Decision | “We agreed to X” | decidedAt, confidence |
ActionItem | Task someone owes | dueDate, status |
Approval | Explicit approval gate | approver, approvedAt, scope |
This schema is intentionally operator-first: it models what the business does, not just what it knows.
4) The edges that actually matter (and how to infer them safely)
If nodes are nouns, edges are the verbs.
Start with edges you can infer cheaply and safely:
- `OWNS` (Person → Project/DriveFile)
- `MENTIONED_IN` (Entity → DriveFile/EmailThread/Meeting)
- `RELATED_TO` (Customer ↔ DriveFile/EmailThread)
- `LATEST_VERSION_OF` (DriveFile → DriveFile)
- `DECISION_FROM` (Decision → Meeting/EmailThread)
- `ASSIGNED_TO` (ActionItem → Person)
- `REQUIRES_APPROVAL` (ActionItem/Change → Approval)
How to infer edges without overfitting
Use a layered approach:
1. Deterministic signals first
   - Drive file owners, sharing ACLs
   - Gmail headers (From/To/CC), thread IDs
   - Calendar attendees
2. Lightweight extraction second
   - Domains → likely `Org`/`Customer`/`Vendor`
   - Invoice/PO numbers → linking anchors
   - Explicit phrases (“approved”, “LGTM”, “go ahead”) → candidate `Approval`
3. LLM extraction last (and always scored)
   - “What decision was made?”
   - “Who owns the next step?”
   - “Which customer is this thread about?”
If it can’t be inferred confidently, store it as a low-confidence edge and let workflows upgrade it over time.
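One way to make “low-confidence edge, upgraded over time” concrete is to store a score per edge and only let stronger evidence overwrite weaker evidence. A minimal in-memory sketch (the `EDGES` store and `upsert_edge` helper are illustrative, not a real API):

```python
# Minimal sketch of confidence-scored edges: a deterministic signal
# (confidence 1.0) can upgrade an LLM guess (confidence 0.4), never the reverse.
EDGES: dict[tuple, float] = {}  # (edge_type, src, dst) -> confidence

def upsert_edge(edge_type: str, src: str, dst: str, confidence: float) -> bool:
    """Insert or upgrade an edge; returns True if the store changed."""
    key = (edge_type, src, dst)
    if confidence > EDGES.get(key, 0.0):
        EDGES[key] = confidence
        return True
    return False

# An LLM guess lands first...
upsert_edge("RELATED_TO", "customer:acme.com", "thread:t1", 0.4)
# ...then a deterministic signal (e.g., a Gmail From: header) upgrades it.
upsert_edge("RELATED_TO", "customer:acme.com", "thread:t1", 1.0)
```

The same monotonic rule lets a later workflow (say, a human confirming a link during an approval) pin an edge at confidence 1.0 so heuristics can never downgrade it.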
5) Build pipeline: Drive/Gmail → extraction → graph (incremental, not brittle)
A good pipeline has two rules:
- Don’t try to parse everything on day 1.
- Keep the graph anchored to sources-of-truth (file IDs, thread IDs, event IDs) so it doesn’t drift.
Here’s a practical architecture:
```mermaid
graph LR
  A[Google Drive API] --> M[Metadata Ingest]
  B[Gmail API] --> M
  C[Calendar/Meet Transcript] --> E[Extraction]
  M --> G[(Graph Store)]
  E --> G
  G --> R[Retrieval Layer]
  R --> L[LLM / Agent]
  L -->|proposes actions| H[Approval Gates]
  H -->|executes| T[Tools: Gmail/Drive/Sheets]
```
Step 0: choose a graph store (don’t overthink it)
For an MVP:
- Postgres tables (`nodes`, `edges`) are fine.
- Neo4j / Memgraph is great if you want Cypher and multi-hop queries.
The critical thing is not the DB—it’s stable IDs and permission-aware retrieval.
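If you go the Postgres route, two tables really do suffice for a while. A sketch of the layout (shown with `sqlite3` for portability; the SQL is the same shape in Postgres, where `props` would be JSONB):

```python
import sqlite3

# Two-table MVP: nodes keyed by a stable source ID (Drive fileId, Gmail
# threadId), edges as typed (src, dst) pairs with a confidence score.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id    TEXT PRIMARY KEY,   -- e.g. 'drivefile:1AbC...', 'thread:18f...'
    type  TEXT NOT NULL,      -- DriveFile, EmailThread, Person, ...
    props TEXT                -- JSON blob (JSONB in Postgres)
);
CREATE TABLE edges (
    type       TEXT NOT NULL, -- OWNS, RELATED_TO, LATEST_VERSION_OF, ...
    src        TEXT NOT NULL REFERENCES nodes(id),
    dst        TEXT NOT NULL REFERENCES nodes(id),
    confidence REAL NOT NULL DEFAULT 1.0,
    PRIMARY KEY (type, src, dst)
);
""")
conn.execute("INSERT INTO nodes VALUES ('person:ana@example.com', 'Person', '{}')")
conn.execute("INSERT INTO nodes VALUES ('drivefile:abc', 'DriveFile', '{}')")
conn.execute(
    "INSERT INTO edges VALUES ('OWNS', 'person:ana@example.com', 'drivefile:abc', 1.0)"
)
```

The composite primary key on `(type, src, dst)` makes edge writes naturally idempotent, which matters once you re-run ingestion.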
Step 1: ingest Drive metadata + permissions (the cheat code)
Drive metadata is high-signal and low-risk because you’re not reading content yet.
```python
# pseudo-code (Google Drive API)
def ingest_drive_files(drive_service, since_rfc3339: str):
    fields = (
        "files(id, name, mimeType, modifiedTime, owners(emailAddress), "
        "permissions(emailAddress, domain, type, role), webViewLink, parents)"
    )
    page_token = None
    while True:
        resp = drive_service.files().list(
            q=f"modifiedTime > '{since_rfc3339}' and trashed=false",
            fields=f"nextPageToken,{fields}",
            pageToken=page_token,
            pageSize=1000,
            supportsAllDrives=True,
            includeItemsFromAllDrives=True,
        ).execute()
        for f in resp.get("files", []):
            upsert_node("DriveFile", {
                "id": f["id"],
                "name": f.get("name"),
                "mimeType": f.get("mimeType"),
                "modifiedTime": f.get("modifiedTime"),
                "webViewLink": f.get("webViewLink"),
            })
            attach_acl("DriveFile", f["id"], f.get("permissions", []))
            link_owner_edges(f)
        page_token = resp.get("nextPageToken")
        if not page_token:
            break
```
Why this matters: “permission-aware retrieval” is much easier if you model permissions from the start, even before you do any semantic indexing.
Step 2: ingest Gmail threads (threads > messages)
Gmail messages are messy: quoted history duplicates content and inflates token counts. Threads are a better unit.
```python
# pseudo-code (Gmail API)
def ingest_gmail_threads(gmail_service, query: str = "newer_than:90d"):
    # List threads matching the query
    threads = gmail_service.users().threads().list(
        userId="me",
        q=query,
        maxResults=200,
    ).execute().get("threads", [])

    for t in threads:
        full = gmail_service.users().threads().get(userId="me", id=t["id"]).execute()
        subject = guess_subject(full)
        participants = extract_participants(full)
        last_ts = max(int(m.get("internalDate", 0)) for m in full.get("messages", []))
        upsert_node("EmailThread", {
            "id": full["id"],
            "subject": subject,
            "lastMessageAt": last_ts,
            "participantEmails": participants,
        })
        # Link to Customers/Vendors by domain heuristics (later upgraded by extraction)
        link_domains(participants, thread_id=full["id"])
```
Operator tip: store both the raw thread reference and a “deduped, reasoning-ready” thread text view (built by stripping quoted history and signatures).
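A rough sketch of building that “reasoning-ready” view: drop quoted lines, common “On … wrote:” reply headers, and everything after a signature delimiter. The patterns here are heuristics, not a complete parser:

```python
import re

# Heuristic cleanup for a "reasoning-ready" thread view: strip quoted
# history (lines starting with '>'), "On ... wrote:" reply headers, and
# everything after a signature delimiter ("-- ").
def clean_message(body: str) -> str:
    lines = []
    for line in body.splitlines():
        if line.startswith(">"):                  # quoted history
            continue
        if re.match(r"^On .+ wrote:\s*$", line):  # reply header
            continue
        if line.strip() == "--":                  # signature delimiter
            break
        lines.append(line)
    return "\n".join(lines).strip()

raw = "Terms look good.\n\nOn Mon, Jan 6, Ana wrote:\n> Attaching v3.\n--\nSam"
print(clean_message(raw))  # -> "Terms look good."
```

Real-world reply formats vary by locale and client, so keep the raw thread around and treat this view as a lossy cache you can regenerate.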
Step 3: add lightweight extraction (entities + relationships)
Start with cheap extractors:
- email/domain parsing
- regex for PO/invoice numbers
- fuzzy matching on customer names
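For example, PO/invoice numbers and participant domains can be pulled with a few lines of regex and string splitting (the `PO-`/`INV-` shape is an assumption; adapt it to your own numbering scheme):

```python
import re

# Cheap deterministic extractors: regex anchors and email-domain parsing.
PO_RE = re.compile(r"\b(?:PO|INV)-\d{3,}\b")  # assumed numbering scheme

def extract_anchors(text: str) -> list[str]:
    """PO/invoice numbers make excellent linking anchors across artifacts."""
    return PO_RE.findall(text)

def extract_domains(participants: list[str]) -> set[str]:
    """Email domains are the cheapest Customer/Vendor signal available."""
    return {p.split("@", 1)[1].lower() for p in participants if "@" in p}

print(extract_anchors("Re: PO-10421 approved, see INV-99812"))
# -> ['PO-10421', 'INV-99812']
print(extract_domains(["ana@acme.com", "sam@nnode.ai"]))
```

These run on every artifact for free, which is why they come before any model call in the layered approach.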
Then add model extraction only where it pays.
```python
# pseudo-code: extraction with confidence
def extract_decisions_and_actions(meeting_transcript: str) -> dict:
    prompt = """
    Extract:
    1) Decisions (clear agreements)
    2) Action items (owner + due date if present)
    Return JSON with confidence 0-1.
    """
    result = call_llm(prompt, meeting_transcript)
    return result

def write_to_graph(meeting_id: str, extraction: dict):
    for d in extraction.get("decisions", []):
        decision_id = upsert_node("Decision", d)
        upsert_edge("DECISION_FROM", decision_id, meeting_id)
    for a in extraction.get("action_items", []):
        action_id = upsert_node("ActionItem", a)
        upsert_edge("ACTION_FROM", action_id, meeting_id)
        if a.get("owner_email"):
            upsert_edge("ASSIGNED_TO", action_id, f"person:{a['owner_email']}")
```
Step 4: build retrieval that uses both vectors and graph
The sweet spot for many teams is hybrid:
- Vector index for “find relevant passages.”
- Graph for “what’s connected / latest / owned by / approved by.”
Practical pattern:
- Use the graph to filter and structure the candidate set (e.g., latest versions, same customer, same project).
- Use vectors to rank within that constrained set.
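In code, “graph-first, vectors-second” can be as small as: filter candidate IDs by graph constraints, then rank only those by similarity. The toy embeddings, the in-memory `vector_index`, and cosine similarity are illustrative stand-ins for your real index:

```python
import math

# Graph-first filtering, vector ranking second: the graph constrains the
# candidate set (same customer, latest versions), vectors order it.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def hybrid_retrieve(query_vec, graph_candidates, vector_index, top_k=3):
    """graph_candidates: node IDs allowed by graph constraints.
    vector_index: node_id -> embedding."""
    scored = [(cosine(query_vec, vector_index[nid]), nid)
              for nid in graph_candidates if nid in vector_index]
    return [nid for _, nid in sorted(scored, reverse=True)[:top_k]]

index = {"doc:a": [1.0, 0.0], "doc:b": [0.7, 0.7], "doc:c": [0.0, 1.0]}
# The graph says only doc:b and doc:c belong to this customer,
# so doc:a never competes, no matter how similar it is.
print(hybrid_retrieve([0.0, 1.0], ["doc:b", "doc:c"], index, top_k=1))
# -> ['doc:c']
```

The important property is that graph constraints are hard filters, not score boosts: a stale version or wrong-customer doc can never “win” on similarity alone.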
6) GraphRAG vs plain RAG: a decision framework
Here’s the simplest decision rule:
- Use plain RAG when the user question is basically: “Find and summarize a document.”
- Use GraphRAG / a workspace knowledge graph when the question is: “Resolve ambiguity, recency, or responsibility across multiple artifacts.”
RAG is usually enough for:
- “Summarize our Q1 onboarding doc.”
- “What does this contract clause say?”
GraphRAG wins when you need:
- Latest: “What’s the latest price sheet for Customer X?”
- Ownership: “Who can approve changes to the vendor terms?”
- Traceability: “Which email thread approved the discount?”
- Multi-hop: “Which projects depend on Vendor Y and have renewals next quarter?”
Concrete example (operator query)
Question: “Send me the latest vendor terms and the last negotiated exception.”
- Vector RAG might retrieve the terms PDF.
- The graph can connect:
  - `Vendor → DriveFile` (terms)
  - `Vendor → EmailThread` (negotiation)
  - `EmailThread → Decision` (exception)
  - `Decision → Person` (approver)
Then the assistant can answer with:
- the latest terms link
- the negotiation summary
- who approved what
- and what action is safe to take next
7) Permission-aware by design (no “God mode” search)
A Google Workspace context layer has to assume:
- different users see different files
- ACLs change
- auditability matters
Practical implementation notes:
1. Propagate ACLs into the graph
   - For `DriveFile`, store the file’s permissions (users/domains/groups) as an ACL blob.
   - For derived nodes (e.g., `Decision` extracted from a doc), inherit the ACL from the source artifact.
2. Enforce retrieval by the requesting user
   - The retriever should accept `(user_id, query)` and only return artifacts the user can access.
3. Explain “why you’re seeing this”
   - “You can see this because it’s shared with your org” is a trust multiplier.
4. Maintain an audit log
   - Which sources were retrieved
   - Which actions were proposed
   - Which approvals were obtained
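A minimal sketch of the enforcement step, checking a user against a stored ACL blob before an artifact can be returned (the ACL entry shapes are assumptions; real Drive permissions also include groups, which need membership expansion):

```python
# Minimal permission filter: a retrieved artifact is visible only if the
# requesting user matches its ACL (by email, by domain, or "anyone").
def can_access(user_email: str, acl: list[dict]) -> bool:
    domain = user_email.split("@", 1)[1]
    for entry in acl:
        if entry["type"] == "user" and entry.get("emailAddress") == user_email:
            return True
        if entry["type"] == "domain" and entry.get("domain") == domain:
            return True
        if entry["type"] == "anyone":
            return True
    return False  # note: group ACLs would need membership expansion

def filter_results(user_email: str, candidates: list[dict]) -> list[dict]:
    return [c for c in candidates if can_access(user_email, c["acl"])]

docs = [
    {"id": "doc:a", "acl": [{"type": "domain", "domain": "acme.com"}]},
    {"id": "doc:b", "acl": [{"type": "user", "emailAddress": "sam@nnode.ai"}]},
]
print([d["id"] for d in filter_results("ana@acme.com", docs)])  # -> ['doc:a']
```

Run this filter after graph traversal but before anything reaches the LLM, so inaccessible content never enters the context window.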
This is one reason nNode focuses on working inside the tools teams already use: the assistant should honor your existing permissions model, not bypass it.
8) How the graph powers multi-agent execution (not just answers)
The real payoff isn’t “better Q&A.” It’s reliable execution.
In multi-agent systems, a supervisor agent often needs stable state:
- “What customer is this about?”
- “What’s the current canonical doc?”
- “What’s already been done?” (idempotency)
The graph becomes the shared memory:
- The supervisor agent looks up `Project`/`Customer` context.
- Specialist agents do targeted work (research, costing, drafting).
- Outputs are written back as new artifacts linked to the same entities.
That’s the architecture nNode is building toward: an AI operations assistant (“Sam”) that routes work through specialist agents, but stays grounded in a private business context layer.
9) Approval-first execution: where the graph reduces risk
If you let an assistant take actions, the safest default is:
- draft first
- ask for approval
- execute only after explicit confirmation
The graph helps because approvals should be grounded in context:
- who the email is going to
- which customer/project it belongs to
- what sources were used
- what changed vs the last version
Practical approval gates:
- Sending an external email
- Changing Drive sharing permissions
- Updating a “system-of-record” Sheet
- Generating a quote or invoice
Think “preview diff” for operations.
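The “preview diff” pattern can be modeled as a pending-action queue: nothing external runs until an explicit approval flips the action’s status. A sketch under assumed names (`propose`, `approve`, and the `PENDING` store are all illustrative):

```python
import uuid

# Approval-first execution: actions are drafted into a pending queue and
# executed only after an explicit approve() call.
PENDING: dict[str, dict] = {}

def propose(action_type: str, payload: dict) -> str:
    """Draft an action (e.g., an outbound email) without executing it."""
    action_id = str(uuid.uuid4())
    PENDING[action_id] = {"type": action_type, "payload": payload,
                          "status": "pending_approval"}
    return action_id

def approve(action_id: str, approver: str) -> dict:
    action = PENDING[action_id]
    if action["status"] != "pending_approval":
        raise ValueError("already resolved")
    action["status"] = "approved"
    action["approver"] = approver  # audit trail: who unblocked execution
    return action  # executor only runs actions with status == "approved"

aid = propose("send_email", {"to": "ana@acme.com", "draft": "..."})
print(PENDING[aid]["status"])  # -> pending_approval
approve(aid, "sam@nnode.ai")
print(PENDING[aid]["status"])  # -> approved
```

Because each approval records the approver, the queue doubles as the audit log described in the permissions section.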
10) Start small: two starter workflows that justify the graph immediately
You don’t need a platform-wide rollout. Start with two workflows that are painful today.
Workflow A: “Find the latest vendor terms + last email thread”
Trigger: user asks “what are our terms with Vendor Y?”
Graph steps:
1. Resolve `Vendor` by domain/name.
2. Traverse to related `DriveFile` nodes.
3. Pick the candidate “latest” file using `modifiedTime` + `LATEST_VERSION_OF` edges.
4. Traverse to related `EmailThread` nodes; pick the most recent negotiation thread.
Output: a short brief + links, permission-aware.
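The “pick latest” step can combine both signals: follow `LATEST_VERSION_OF` chains to their heads, then break ties among heads by `modifiedTime`. A sketch over in-memory dicts (the data shapes are illustrative):

```python
# Picking the "latest" file: exclude anything superseded by a
# LATEST_VERSION_OF edge, then fall back to modifiedTime among the heads.
def pick_latest(files: dict[str, str], version_edges: list[tuple[str, str]]) -> str:
    """files: fileId -> RFC3339 modifiedTime.
    version_edges: (newer, older) LATEST_VERSION_OF pairs."""
    superseded = {older for _, older in version_edges}
    heads = [fid for fid in files if fid not in superseded]
    # Same-format RFC3339 UTC timestamps sort lexicographically, so max() works.
    return max(heads, key=lambda fid: files[fid])

files = {
    "f1": "2024-01-10T09:00:00Z",  # superseded by f2
    "f2": "2024-03-02T14:30:00Z",
    "f3": "2024-02-20T08:00:00Z",  # unrelated doc
}
print(pick_latest(files, [("f2", "f1")]))  # -> f2
```

Explicit version edges matter because `modifiedTime` alone would happily pick `final_v3.pdf` over a newer file someone renamed.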
Workflow B: “Meeting → decisions → tasks → follow-up email draft (approval required)”
Trigger: meeting ends, transcript arrives.
Graph steps:
1. Extract `Decision` + `ActionItem`.
2. Link to attendees and the project/customer.
3. Draft a follow-up email with bullets: decisions + owners + dates.
4. Present the draft; the user approves; the assistant sends.
This is exactly the kind of “meeting to action” loop where a context layer turns a demo into something durable.
11) Common failure modes (and how to avoid them)
- Overbuilding schema: if it doesn’t power a workflow this month, defer it.
- Ignoring recency: “latest” is a first-class concept; model it explicitly.
- Entity collisions: two “Acme” customers? Use domains, external IDs, and confidence.
- Permission drift: reconcile ACLs regularly; don’t assume static sharing.
- Stale graph: build incremental sync (`modifiedTime`/`historyId`), not batch rebuilds.
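Incremental Drive sync can be as simple as persisting a `modifiedTime` cursor and querying only past it (Gmail’s equivalent is tracking `historyId`). A sketch of the cursor logic, with an assumed in-memory cursor store:

```python
# Incremental sync cursor: remember the max modifiedTime seen and only ask
# for files changed after it on the next run (same q shape as the Drive
# ingest pseudo-code earlier in this post).
CURSOR = {"drive_modified_after": "1970-01-01T00:00:00Z"}

def build_drive_query(cursor: dict) -> str:
    return f"modifiedTime > '{cursor['drive_modified_after']}' and trashed=false"

def advance_cursor(cursor: dict, batch: list[dict]) -> None:
    """Move the cursor forward after a successful batch; never backward."""
    if batch:
        cursor["drive_modified_after"] = max(f["modifiedTime"] for f in batch)

batch = [{"id": "f1", "modifiedTime": "2024-03-02T14:30:00Z"},
         {"id": "f2", "modifiedTime": "2024-03-05T09:00:00Z"}]
advance_cursor(CURSOR, batch)
print(build_drive_query(CURSOR))
# -> modifiedTime > '2024-03-05T09:00:00Z' and trashed=false
```

Persist the cursor only after the batch is fully written to the graph, so a crash mid-run re-fetches rather than silently drops changes.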
12) Practical next steps (this week vs this quarter)
This week (MVP)
- Ingest Drive metadata + ACLs
- Create `DriveFile`, `Person`, `Org` nodes
- Ingest Gmail threads (store thread IDs + participants)
- Add domain-based linking (low confidence is fine)
- Implement permission-aware retrieval filter
This quarter (operator-grade)
- Thread dedupe + “reasoning-ready” email views
- Meeting transcript extraction → `Decision`/`ActionItem`
- `LATEST_VERSION_OF` modeling for common doc types
- Hybrid retrieval: graph-first constraints + vector ranking
- Audit log + approval gates for actions
FAQ (for operators and IT/admin stakeholders)
**Do I need a “real” knowledge graph database?** Not at first. What you need is a stable representation of nodes/edges and permission-aware retrieval. Many teams ship an MVP in Postgres and migrate later if needed.

**Will this force us to reorganize Drive?** No. The point of a Google Workspace context layer is to work with messy reality. You’ll likely improve hygiene over time, but the graph should not depend on a perfect folder structure.

**Is GraphRAG always better than RAG?** No. Plain RAG is simpler and often faster for “find and summarize.” GraphRAG pays off when questions require recency, ownership, and multi-hop relationships.
Where nNode fits
nNode is building an AI operations assistant (“Sam”) that works where your business already lives—especially Google Drive and Gmail—so you can get value without migrating into a new system.
The core thesis is exactly what this post described:
- a lightweight private context layer (the “Karpathy graph”)
- permission-aware retrieval
- multi-agent orchestration behind a simple chat interface
- approval-first execution
If you’re trying to make an assistant reliably answer operator questions like “what’s the latest, who owns it, what’s connected, what do we do next?”—take a look at what we’re building at https://nnode.ai.