I didn't expect IBM to be making relevant AI models but this thing is priced at $1 per 4,000,000 output tokens... I'm using it to transcribe handwritten input text and it works very well and super fast.
Love this! Would have liked to see something like textract for a pre-LLM benchmark (but of course that's expensive), and also a distinction between handwritten text and printed one.
Ultimately, there’s some intersection of accuracy x cost x speed that’s ideal, which can be different per use case. We’ll surface all of those metrics shortly so that you can pick the best model for the job along those axes.
We wanted to keep the focus on (1) foundation VLMs and (2) open source OCR models.
We had Mistral previously but had to remove it because their hosted API for OCR was super unstable and returned a lot of garbage results unfortunately.
Paddle, Nanonets, and Chandra being added shortly!
I didn't expect IBM to be making relevant AI models but this thing is priced at $1 per 4,000,000 output tokens... I'm using it to transcribe handwritten input text and it works very well and super fast.
But still, this is incredibly useful!
We had Mistral previously but had to remove it because their hosted API for OCR was super unstable and returned a lot of garbage results unfortunately.
Paddle, Nanonets, and Chandra being added shortly!
[see https://news.ycombinator.com/item?id=45988611 for explanation]