Read and extract text and other content from PDFs in C# (port of PDFBox)
A Gtk/Qt front-end to tesseract-ocr.
OCR engine for all the languages
Document Layout Analysis resources for development with PdfPig.
Conversions between various OCR formats
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
Text Overlay plugin for Mirador 3
Ergonomic line-by-line transcription of scanned text.
Text-to-tibble