Tesseract is a pretty decent open source OCR engine that was developed by HP back in the day, but is now maintained as open source by Google. It has a C++ API that you can program to, and as you would expect there are a number of wrappers (of varying quality) that allow you to use libtesseract from Python. For various reasons, none of those fit my needs, so I created my own SIP-based wrapper instead. I will not be maintaining this as a format project, but if you want an apache licensed python wrapper, you can find the code on github. 🙂