PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Compact VLM (huggingface.co)
12 points by daemonologist a day ago
daemonologist a day ago
Original title: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
Direct link to PDF: https://ernie.baidu.com/blog/publication/PaddleOCR-VL_Techni...
Baidu claims state of the art performance on their own OmniDocBench (although some recent models like GPT-5 and Qwen3 are not evaluated) and strong results on olmOCR-Bench and Ocean-OCR-Bench.

daft_pink a day ago
Wow is this commercially usable? I think the Microsoft and IBM top class document parsing is non commercial use only.

daemonologist a day ago
Yes, Apache 2.