Show HN: One-liner CLI for batched PDF-to-Markdown at $1 per ~6k pages

github.com

8 points by monatis 3 days ago

Extracting clean text from PDFs is still a mess. Tools like docling and marker do a decent job, but they're slow and resource-hungry. pymupdf4llm is fast, but it's AGPL-licensed, which can mean open-sourcing whatever you build on top of it, even if users only reach it over the network.

Gemini Batch Prediction gives you high throughput at very low cost: roughly $1 per 6,000 pages. The catch? It's a pain to use.
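
For a rough sense of scale, here's a back-of-the-envelope sketch that takes the ~$1 per 6,000 pages figure at face value. Real costs will vary with the model and how dense your pages are, and `estimate_cost_usd` is a made-up helper for illustration, not part of the CLI.

```python
# Assumption: the post's figure of roughly $1 per 6,000 pages holds on average.
PAGES_PER_DOLLAR = 6_000


def estimate_cost_usd(pages: int, pages_per_dollar: int = PAGES_PER_DOLLAR) -> float:
    """Return an approximate batch cost in USD for a given page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / pages_per_dollar


if __name__ == "__main__":
    for pages in (6_000, 100_000, 1_000_000):
        print(f"{pages:>9,} pages ~ ${estimate_cost_usd(pages):,.2f}")
```

So a million-page archive would land somewhere around $170 under that assumption.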

That is, until now. We wrapped it up in a few friendly CLI commands—simple enough for your grandparents to enjoy.