Cognitive System: Context Graphs for Finance
Node 2: What the Context Graph Actually Looks Like Right Now
As of this morning's dump: 37 nodes, 53 edges. The honest breakdown is more complicated — but before we get to the problems, here is one thing the graph got exactly right.
A real signal the graph captured this week
From May 20 to 22, Foreign Institutional Investors sold a net ₹7,928 Cr across three sessions. Domestic Institutional Investors absorbed every rupee of it, buying ₹10,463 Cr over the same period, which is ₹2,535 Cr more than the FIIs sold. The Indian market didn't collapse. It held.
The graph didn't just record those two numbers separately. It connected them:
Four hops. One causal chain. Captured by fii_dii_scraper.py across the May 20–22 sessions.
At the same time, india_pulse_scraper.py picked up UPI data for April 2026 — 22.35 billion transactions worth ₹29 lakh crore. The graph connected this as a domestic consumption signal sitting alongside the FII exit:
UPI April 2026: 22.35B transactions · ₹29 lakh crore value
Graph edge: India Macro Pulse → HAS_VALUE → UPI
What this means in context: while foreign capital was exiting, domestic transaction volume was at record levels. The graph held both facts simultaneously — not as separate data points, but as a combined picture of who was leaving and who was staying.
That is the graph working as intended. The problem is that most of the other 49 edges are noise around this core signal.
The scraper concentration
47 of 53 edges came from one scraper: fii_dii_scraper.py. That's 89% of the entire intelligence graph produced by a single data feed.
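The concentration figure is a simple share calculation. A minimal sketch, assuming each edge carries the name of the scraper that produced it; the `source_share` helper and the `"other"` label for the remaining six edges are illustrative, not part of the actual pipeline:

```python
from collections import Counter

def source_share(edge_sources):
    """Return (source, fraction_of_edges) pairs, largest share first."""
    counts = Counter(edge_sources)
    total = sum(counts.values())
    return [(src, n / total) for src, n in counts.most_common()]

# 47 edges from fii_dii_scraper.py; the other 6 are lumped under a
# placeholder label because the text does not break them down.
edge_sources = ["fii_dii_scraper.py"] * 47 + ["other"] * 6

shares = source_share(edge_sources)
top_source, top_share = shares[0]
print(f"{top_source}: {top_share:.0%}")  # fii_dii_scraper.py: 89%
```

Running a check like this on every graph dump would make the dependence on a single feed visible as a number rather than an impression.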
The duplicate node problem
The FII/DII signal is real. But because each scraper run creates its own version of the same entity, the graph can't connect the chain cleanly. The same real-world actor — Foreign Institutional Investors — exists as three separate nodes:
This means a traversal starting from FIIs and one starting from Foreign Institutional Investors follow completely separate paths — even though they describe the same actor. The four-hop chain above only works because those edges happened to use the same node name in that run. Across runs, the chain breaks.
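One common way to close this gap is to resolve every node name to a canonical form before an edge is written. The following is a sketch under assumptions, not the system's actual code: the `ALIASES` table, the `canonical_name` helper, and the relationship labels `REL_A` and `REL_B` are hypothetical, and only the two spellings named above are listed.

```python
import re

# Hypothetical alias table. Keys are lowercased, whitespace-normalized names.
ALIASES = {
    "fiis": "Foreign Institutional Investors",
    "foreign institutional investors": "Foreign Institutional Investors",
}

def canonical_name(raw: str) -> str:
    """Map a scraper-supplied node name to its canonical form, if one is known."""
    key = re.sub(r"\s+", " ", raw).strip().lower()
    return ALIASES.get(key, raw.strip())

def canonicalize_edges(edges):
    """Rewrite (source, relation, target) triples so aliases share one node."""
    return [(canonical_name(s), rel, canonical_name(t)) for s, rel, t in edges]

# Two runs that named the same actor differently (relation labels are placeholders).
run_edges = [
    ("FIIs", "REL_A", "node_x"),
    ("Foreign Institutional Investors", "REL_B", "node_y"),
]
merged = canonicalize_edges(run_edges)
sources = {s for s, _, _ in merged}
print(sources)  # {'Foreign Institutional Investors'}
```

After canonicalization both edges start from the same node, so a traversal from either spelling reaches both paths.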
What the quarantine found
This morning we ran a quarantine audit. The system flagged and removed data that came from test scrapers, could not be verified against public sources, or was demo content that had reached production.
The most instructive case:
Claimed: Salil Parekh made a market purchase of INFY shares worth ₹420.5 lakhs in mid-May 2026.
Reality: SEBI filings show he exercised grants (received shares) on ~May 2, then sold 95,800 shares on May 8. No open-market purchase in mid-May.
Result: Edge Salil Parekh → INSIDER_BUY → INFY removed. The graph had it as a bullish signal. It was the opposite.
A second quarantine run removed 54 nodes and 37 edges that were sensor infrastructure — edges tracking whether sensors were alive, not encoding market intelligence. The graph was mixing pipeline telemetry and investment signal. Those are two different layers.
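Keeping the two layers apart can be as simple as routing edges by relationship type. This is a hedged sketch: the telemetry type names `SENSOR_ALIVE` and `LAST_HEARTBEAT`, the `sensor_x` node, and the dictionary edge format are assumptions made for illustration; only the `India Macro Pulse → HAS_VALUE → UPI` edge comes from the graph described above.

```python
# Hypothetical relationship types that describe pipeline health, not markets.
TELEMETRY_TYPES = {"SENSOR_ALIVE", "LAST_HEARTBEAT"}

def split_layers(edges):
    """Partition edges into (signal, telemetry) lists by relationship type."""
    signal, telemetry = [], []
    for edge in edges:
        if edge["type"] in TELEMETRY_TYPES:
            telemetry.append(edge)
        else:
            signal.append(edge)
    return signal, telemetry

edges = [
    {"source": "India Macro Pulse", "type": "HAS_VALUE", "target": "UPI"},
    {"source": "sensor_x", "type": "SENSOR_ALIVE", "target": "pipeline"},  # placeholder
]
signal, telemetry = split_layers(edges)
print(len(signal), len(telemetry))  # 1 1
```

With a split like this, telemetry can be stored and monitored separately while only the signal list is written to the investment graph.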
The OOM crash
At 3:04 PM the service crashed. Render logged it: "Ran out of memory — used over 512MB while running your code." This happened when we pushed all 27 sensors to write simultaneously. The process exceeded the instance memory limit.
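One way to avoid having all 27 sensors write at once is to cap how many run concurrently. The sketch below is not the fix that was deployed; it assumes each sensor is a zero-argument callable and that peak memory scales with the number of sensors in flight. The `run_sensors_bounded` helper, the fake sensors, and the limit of 4 are all illustrative.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_sensors_bounded(sensors, max_concurrent=4):
    """Run each sensor callable, with at most max_concurrent in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(lambda sensor: sensor(), sensors))

# Demo: 27 fake sensors that record how many were active at the same time.
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()

def make_fake_sensor(sensor_id):
    def sensor():
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.01)  # stand-in for scrape-and-write work
        with stats_lock:
            stats["active"] -= 1
        return sensor_id
    return sensor

results = run_sensors_bounded([make_fake_sensor(i) for i in range(27)], max_concurrent=4)
print(len(results), stats["peak"] <= 4)  # 27 True
```

The right limit depends on how much memory a single sensor run needs, which would have to be measured against the 512MB ceiling.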
The 27 sensors now active
Two things that have to work first
CUSHIONS, CUSHIONS_AGAINST, CUSHIONS_EFFECTS_OF, CUSHIONS_POSITIVELY, COUNTERACTS, MITIGATED_BY: six ways of saying the same thing between the same nodes. We are collapsing the edge vocabulary to ~12 controlled types. When a scraper fires again on an existing relationship, it updates the weight of the existing edge instead of creating a new one.
Next: what the graph looks like after seven days of 27-sensor operation.
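The type collapse and weight update for the six cushioning relationship types could look roughly like this. Everything here is an assumption made for illustration: the choice of `CUSHIONS` as the canonical name, treating `MITIGATED_BY` as the same relationship read in the opposite direction, additive weights, and the `node_a`/`node_b` placeholders.

```python
# Hypothetical mapping: raw type -> (canonical type, direction is reversed?)
CANONICAL_TYPE = {
    "CUSHIONS": ("CUSHIONS", False),
    "CUSHIONS_AGAINST": ("CUSHIONS", False),
    "CUSHIONS_EFFECTS_OF": ("CUSHIONS", False),
    "CUSHIONS_POSITIVELY": ("CUSHIONS", False),
    "COUNTERACTS": ("CUSHIONS", False),
    # Assumption: "A MITIGATED_BY B" is read as "B CUSHIONS A".
    "MITIGATED_BY": ("CUSHIONS", True),
}

def upsert_edge(graph, source, raw_type, target, weight=1.0):
    """Add weight to the canonical edge, creating it only if it is missing."""
    canonical, reversed_direction = CANONICAL_TYPE.get(raw_type, (raw_type, False))
    if reversed_direction:
        source, target = target, source
    key = (source, canonical, target)
    graph[key] = graph.get(key, 0.0) + weight  # additive update is an assumption
    return key

graph = {}
for raw_type in ["CUSHIONS", "CUSHIONS_AGAINST", "COUNTERACTS"]:
    upsert_edge(graph, "node_a", raw_type, "node_b")
upsert_edge(graph, "node_b", "MITIGATED_BY", "node_a")

print(graph)  # {('node_a', 'CUSHIONS', 'node_b'): 4.0}
```

Four scraper observations end up as one edge with weight 4.0 instead of four parallel edges with different labels.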