Show HN: ContextD – OCRs your screen activity, use it with LLMs via local API

ContextD is a macOS utility that continuously captures your screen activity with OCR, stores the extracted text locally, and exposes it through a local API for integration with AI tools. The application uses pixel diffing and keyframe logic to minimize processing, keeps captured data on your device apart from API calls to OpenRouter, and includes a prompt enrichment feature that adds relevant context from your recent activity to questions you ask AI assistants.

ContextD is a lightweight macOS application that monitors your screen in real time, extracts text from what you're viewing through optical character recognition (OCR), and makes that context available to AI language models via a local API. The entire workflow happens on your machine; the only data that leaves your computer is what is sent in API calls to OpenRouter.

How It Works

Every two seconds, ContextD snapshots your display and compares it to the previous capture. Rather than processing the entire screen each time, it uses SIMD-accelerated pixel diffing to identify what has changed, then runs OCR only on the modified regions and stores the extracted text in a local SQLite database. Screenshots are processed in memory and immediately discarded; no images are retained on disk.
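
The diffing step can be illustrated with a small sketch. This is not ContextD's implementation (which is described as SIMD-accelerated and lives in a Swift app); it is a plain Python illustration of the idea, assuming grayscale frames stored as nested lists and a hypothetical tile size of 4 pixels. The function name changed_tiles and its parameters are made up for this example.

```python
from typing import List, Tuple

Frame = List[List[int]]           # grayscale pixels, row-major (assumption)
Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def changed_tiles(prev: Frame, curr: Frame, tile: int = 4,
                  tolerance: int = 0) -> List[Tile]:
    """Return the tiles whose pixels differ between two same-sized frames."""
    height, width = len(curr), len(curr[0])
    tiles: List[Tile] = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            # Clamp tiles that run past the frame edge.
            h = min(tile, height - ty)
            w = min(tile, width - tx)
            differs = any(
                abs(curr[y][x] - prev[y][x]) > tolerance
                for y in range(ty, ty + h)
                for x in range(tx, tx + w)
            )
            if differs:
                tiles.append((tx, ty, w, h))
    return tiles


# An 8x8 frame where a single pixel changes in the bottom-right quadrant.
prev_frame = [[0] * 8 for _ in range(8)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[5][6] = 255

# Only the reported tiles would need to be passed to OCR.
print(changed_tiles(prev_frame, curr_frame))
```

Working on tiles rather than single pixels keeps the list of regions short, so the OCR step receives a few rectangles instead of scattered points.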

A background process continuously summarizes your activity using Claude Haiku via OpenRouter, keeping costs minimal (roughly $2 per day). This summary data becomes queryable through a local HTTP API, enabling other applications and AI agents to understand what you've been working on without storing raw visual data.

Getting Started

Installation requires macOS 14 or later and Swift 5.9. After cloning the repository and running make build, you need to package the binary as an app bundle so that macOS presents its permission dialogs. On first launch, grant screen recording and accessibility access, then enter your OpenRouter API key in settings.

ContextD serves an interactive API on localhost at port 21890, complete with Swagger documentation. You can perform full-text searches across activity summaries, retrieve recent activity spanning specific time windows, or browse captures around particular timestamps.
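
As a rough sketch of how a client might call this API from Python: the port comes from the post, but the endpoint path and parameter names used here (/search, q, limit) are assumptions for illustration only. The Swagger documentation served by the app is the place to find the real routes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:21890"  # port as stated in the post


def build_url(path: str, **params) -> str:
    """Build a request URL, dropping parameters that are None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def get_json(path: str, **params):
    """Fetch a URL from the local API and decode the JSON response."""
    with urlopen(build_url(path, **params), timeout=5) as response:
        return json.load(response)


# Hypothetical route and parameters; nothing is sent until get_json is called.
print(build_url("/search", q="invoice draft", limit=5))
```

With ContextD running, a call such as get_json("/search", q="invoice draft", limit=5) would perform the request, provided the route exists under that name.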

Enriching Prompts with Context

Prompt enrichment is triggered with Cmd+Shift+Space, which opens the enrichment panel. Paste your question or request, specify a lookback window, and ContextD appends relevant context from your recent activity. The enriched prompt includes footnoted references to specific moments, showing when and where the relevant information appeared on your screen. This lets you give AI assistants contextual requests without manually copying and pasting details.
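
To make the shape of an enriched prompt concrete, here is a minimal sketch. It assumes a naive keyword match for relevance, whereas the post indicates ContextD uses a configurable model for enrichment. The Capture type, the enrich function, and the footnote layout are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Capture:
    timestamp: datetime
    app: str
    text: str


def enrich(question: str, captures: List[Capture], now: datetime,
           lookback_minutes: int) -> str:
    """Append footnoted context from captures inside the lookback window."""
    cutoff = now - timedelta(minutes=lookback_minutes)
    # Naive relevance: share at least one word of five or more letters.
    keywords = set(re.findall(r"[a-z]{5,}", question.lower()))
    relevant = [
        c for c in captures
        if c.timestamp >= cutoff
        and any(word in c.text.lower() for word in keywords)
    ]
    if not relevant:
        return question
    lines = [question, "", "Context from recent screen activity:"]
    for index, capture in enumerate(relevant, start=1):
        lines.append(
            f"[{index}] {capture.timestamp:%H:%M} in {capture.app}: {capture.text}"
        )
    return "\n".join(lines)


now = datetime(2024, 1, 1, 12, 0)
history = [
    Capture(datetime(2024, 1, 1, 11, 50), "Terminal",
            "migration 0042 failed: column missing"),
    Capture(datetime(2024, 1, 1, 11, 55), "Safari",
            "weather forecast for Tuesday"),
    Capture(datetime(2024, 1, 1, 9, 0), "Xcode", "migration plan draft"),
]
print(enrich("Why does the migration fail?", history, now, lookback_minutes=30))
```

In this example only the Terminal capture is appended: the Safari capture shares no keyword with the question, and the Xcode capture falls outside the 30-minute window.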

Customization and Control

The settings panel lets you adjust the API key, capture frequency, the models used for summarization and for enrichment, token limits, and data retention policies. The capture pipeline is also configurable: the system treats each frame as either a full keyframe (capturing the entire screen) or a delta update (capturing only changed regions), depending on the percentage of pixels that changed. Developers can inspect the database directly using the provided make targets to review statistics, recent captures, and search results.
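
The keyframe-versus-delta decision can be sketched as a simple threshold check. The 50% default below is an assumed value for illustration, since the post does not state the actual threshold, and the function name classify_frame is hypothetical.

```python
def classify_frame(changed_pixels: int, total_pixels: int,
                   keyframe_threshold: float = 0.5) -> str:
    """Return "keyframe" when enough of the screen changed, else "delta"."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    fraction_changed = changed_pixels / total_pixels
    return "keyframe" if fraction_changed >= keyframe_threshold else "delta"


# A small edit in one window versus switching to a different full-screen app.
print(classify_frame(changed_pixels=100, total_pixels=1000))
print(classify_frame(changed_pixels=800, total_pixels=1000))
```

A lower threshold produces more keyframes (more OCR work, simpler reconstruction), while a higher one favors deltas (less work per frame).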

Source: Hacker News Show HN

‘Uncanny Valley’: Nvidia’s ‘Super Bowl of AI,’ Tesla Disappoints, and Meta’s VR Metaverse ‘Shutdown’

This episode of WIRED's Uncanny Valley podcast examined Nvidia's annual developer conference, where CEO Jensen Huang projected $1 trillion in AI chip revenue opportunities and unveiled a new product from its partnership with Groq. The show also covered Tesla's deteriorating relationship with influential online supporters and Meta's partial reversal of its decision to shut down the Horizon Worlds VR platform, revealing the company's struggle with its metaverse vision.

Show HN: Untitled88 – Query your QuickBooks data in plain English

Untitled88 has launched a QuickBooks integration that enables users to query their financial data using plain English rather than traditional database queries or software navigation. The tool simplifies financial data access for business owners and accountants without technical backgrounds, making it easier to extract insights from QuickBooks records on demand.

EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages

EsoLang-Bench is a new evaluation framework that tests whether large language models truly understand programming concepts by having them solve tasks in obscure, rarely-used esoteric programming languages. By moving beyond common training data, the benchmark aims to distinguish genuine reasoning from simple pattern matching in AI systems.