(https://github.com/allenai/olmocr)
Toolkit for linearizing PDFs for LLM datasets/training
Created by allenai on Sep 17, 2024.
Posted by David Bisset
Hashtags: Python