Damián Vidal

SIFT

Summarize Inputs Faster than Time allows. Personal summarizer for news, video, and audio.

Stack: Python, Gemini, MLX Whisper, Telegram Bot, Browser Extension

Why

This started one morning at breakfast. I was reading the news and queueing up videos to stay current — and realizing, again, that the 40-minute video I’d just bookmarked was probably never going to get watched. The hour-long podcast either. Most days the queue grew faster than I could drain it.

The summarization tools that existed didn’t fix it. NotebookLM is great but lives in Google’s cloud. Snipd, Glasp, Reader — all of them either send your inputs to a third party or charge per use. What I actually wanted was small and selfish: my own machine, my own files, take a YouTube URL or a local video or a news article and give me something I could read in five minutes that wasn’t just a wall of bullet points.

So I wrote a script. yts. It downloaded the video, transcribed it locally with MLX Whisper, and ran the transcript through Gemini. That worked. Then friends saw it, asked if they could use it, and I wrapped it in a Telegram bot. Now it has three entry points (Telegram, web, CLI) and three input sources (browser extension for news, URLs for online video, local paths for everything else).

The output isn’t a wall of bullets. It’s a self-contained HTML page with a mind map of the ideas, the recurring themes, and the standout comments: the most interesting things people said in the comment section, ranked. The comments and the mind map are what I actually use most. The comments often tell you more than the video does. The mind map shows me at a glance whether a piece is worth my time at all.

How it works

A personal summarization system with three entry points and three input sources.

Entry points: a Telegram bot for on-the-go summaries, a web interface for the desktop, and a set of Python CLI scripts wired to shell aliases for one-line use from the terminal.

Input sources: a Brave browser extension that captures news articles via archive.org, URLs for online videos (YouTube and similar), and local file paths for audio/video files I already have on disk.
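
As a rough illustration of how one string could be routed to one of these three input sources, here is a minimal sketch. The names `Source`, `VIDEO_HOSTS`, and `classify` are hypothetical, and the host allow-list is an assumption for the example, not how SIFT necessarily detects video sites.

```python
from enum import Enum
from urllib.parse import urlparse


class Source(Enum):
    ARTICLE = "article"        # news page, e.g. captured by the browser extension
    ONLINE_VIDEO = "video"     # YouTube and similar
    LOCAL_FILE = "local"       # audio/video already on disk


# Hypothetical allow-list; the real project may detect video hosts differently.
VIDEO_HOSTS = {"youtube.com", "youtu.be", "vimeo.com"}


def classify(raw: str) -> Source:
    """Guess which of the three input sources a string refers to."""
    parsed = urlparse(raw)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Match the host itself or any of its subdomains (m.youtube.com, ...).
        if any(host == h or host.endswith("." + h) for h in VIDEO_HOSTS):
            return Source.ONLINE_VIDEO
        return Source.ARTICLE
    # Anything that is not an http(s) URL is treated as a path on disk.
    return Source.LOCAL_FILE


if __name__ == "__main__":
    for item in ("https://youtu.be/abc", "https://example.com/story", "/tmp/talk.mp4"):
        print(item, "->", classify(item).name)
```

Keeping the routing in one small function means the Telegram bot, the web interface, and the CLI could all hand over a raw string and share the same decision.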

Text summarization runs on Gemini. Audio and video transcription runs locally on MLX Whisper (Turbo v3), which is fast on M-series Macs and keeps the raw audio and video on my machine; only the transcript goes to Gemini. Output is a self-contained HTML artifact with a mind map (Mermaid + Cytoscape), key ideas, recurring themes, critiques, and standout comments. The HTML opens in any browser, no server required.
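
To make the mind map part of the output concrete, here is a minimal sketch of turning a nested outline into Mermaid mindmap source and wrapping it in a standalone page. The function names `to_mermaid` and `to_html` and the nested-dict shape are assumptions for illustration, not SIFT's actual code. The sketch also assumes labels are plain text without brackets or parentheses, which Mermaid would read as node shapes, and it leaves out the Cytoscape view and the inlined Mermaid script that a truly self-contained page would need.

```python
from html import escape


def to_mermaid(title: str, tree: dict) -> str:
    """Render a nested dict of ideas as Mermaid mindmap source."""
    lines = ["mindmap", f"  root(({title}))"]

    def walk(node: dict, depth: int) -> None:
        # Mermaid mindmaps express hierarchy purely through indentation.
        for label, children in node.items():
            lines.append("  " * depth + label)
            walk(children, depth + 1)

    walk(tree, 2)
    return "\n".join(lines)


def to_html(title: str, tree: dict) -> str:
    """Wrap the mind map source in a minimal standalone HTML page."""
    diagram = escape(to_mermaid(title, tree))
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n"
        f'<pre class="mermaid">\n{diagram}\n</pre>\n'
        "<!-- Assumption: the Mermaid script would be inlined here so the page"
        " works offline. -->\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    outline = {"Key ideas": {"Idea A": {}, "Idea B": {}}, "Recurring themes": {}}
    print(to_mermaid("Example video", outline))
```

Writing the result of `to_html` to a file gives a page that opens directly from disk, which fits the "no server required" goal once the script is inlined.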

Self-contained HTML output for a YouTube video — mind map, key ideas, recurring themes, and standout comments.

What it does, at a glance — real-time status, archive notifications, summarization features.

Archive Reader, one of the input sources, captures news articles via archive.org.

Status

In active personal use.

See it