OCR tool in C using SDL2 and LLMs (rectangle select -> text)

(github.com)

2 points | by haschka 2 hours ago

1 comment

haschka 2 hours ago
I built a small OCR tool in C that lets you select a region of an image and send it to an LLM for text recognition.
It uses SDL2 for rendering and libcurl for networking, and works with local servers (llama.cpp-style) and, in theory, remote APIs.
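For anyone curious what the request side might look like, here is a minimal sketch, not the tool's actual code. It assumes the server accepts an OpenAI-style chat completions body with the image embedded as a base64 data URI, and that the selected region has already been encoded as PNG bytes. The names b64_encode and build_request are hypothetical. The "model" field is left out, which a remote API would likely require.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode n bytes. Returns a malloc'd NUL-terminated string, or NULL. */
static char *b64_encode(const unsigned char *in, size_t n) {
    char *out = malloc(4 * ((n + 2) / 3) + 1);
    size_t i = 0, o = 0;
    if (!out) return NULL;
    /* Full 3-byte groups become 4 output characters. */
    while (i + 2 < n) {
        unsigned v = ((unsigned)in[i] << 16) | ((unsigned)in[i + 1] << 8) | in[i + 2];
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = B64[(v >> 6) & 63];
        out[o++] = B64[v & 63];
        i += 3;
    }
    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < n) {
        unsigned v = (unsigned)in[i] << 16;
        if (i + 1 < n) v |= (unsigned)in[i + 1] << 8;
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        out[o++] = (i + 1 < n) ? B64[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return out;
}

/* Build the JSON body (assumed format). The prompt is inserted verbatim,
   so it must not contain characters that need JSON escaping. */
static char *build_request(const char *prompt, const unsigned char *png, size_t n) {
    const char *fmt =
        "{\"messages\":[{\"role\":\"user\",\"content\":["
        "{\"type\":\"text\",\"text\":\"%s\"},"
        "{\"type\":\"image_url\",\"image_url\":"
        "{\"url\":\"data:image/png;base64,%s\"}}]}]}";
    char *b64 = b64_encode(png, n);
    char *body;
    int len;
    if (!b64) return NULL;
    len = snprintf(NULL, 0, fmt, prompt, b64);
    body = (len < 0) ? NULL : malloc((size_t)len + 1);
    if (body) snprintf(body, (size_t)len + 1, fmt, prompt, b64);
    free(b64);
    return body; /* caller frees */
}
```

The returned string could then be handed to libcurl via CURLOPT_POSTFIELDS with a "Content-Type: application/json" header.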
The workflow is: open image -> zoom/pan -> draw rectangle -> send -> get text
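The "draw rectangle" step needs the on-screen selection converted back to image pixels before cropping. A rough sketch of that mapping, assuming the view is described by a zoom factor and a pan offset with screen = image * zoom + pan. The View and Rect types and screen_to_image are hypothetical names for illustration, not taken from the repo.

```c
/* Hypothetical view state: screen = image * zoom + pan (assumed convention). */
typedef struct { double zoom, pan_x, pan_y; } View;
typedef struct { int x, y, w, h; } Rect;

static double clampd(double v, double lo, double hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Convert a dragged screen rectangle into image pixels, clamped to the
   image bounds. Assumes v.zoom > 0. */
static Rect screen_to_image(View v, Rect s, int img_w, int img_h) {
    double x0, y0, x1, y1;
    Rect r;

    /* Normalize drags that went up or to the left (negative size). */
    if (s.w < 0) { s.x += s.w; s.w = -s.w; }
    if (s.h < 0) { s.y += s.h; s.h = -s.h; }

    /* Undo pan, then zoom, for both corners. */
    x0 = clampd((s.x - v.pan_x) / v.zoom, 0, img_w);
    y0 = clampd((s.y - v.pan_y) / v.zoom, 0, img_h);
    x1 = clampd((s.x + s.w - v.pan_x) / v.zoom, 0, img_w);
    y1 = clampd((s.y + s.h - v.pan_y) / v.zoom, 0, img_h);

    /* Values are non-negative after clamping, so the casts round down. */
    r.x = (int)x0;
    r.y = (int)y0;
    r.w = (int)x1 - (int)x0;
    r.h = (int)y1 - (int)y0;
    return r;
}
```

A result with w or h equal to 0 means the selection fell entirely outside the image, so there is nothing to send.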
I wanted something lightweight and easy to understand, without large frameworks, and also a way to experiment with vision-capable models in a simple pipeline.
Some features:
- rectangle selection UI
- zoom and pan
- cancel running requests
- minimal dependencies
It’s still pretty early, but usable. https://github.com/haschka/ocr_tool