Hello, I am trying to train Tesseract on a new font based on the following digits:
All digits are provided in a PNG file with a transparent background. If I create a box file for it, train, and so on, it works fine!
Now the problem: in the same situation, I want to train Tesseract based on the following image:
As you can see, the digits are at the same positions, and so on. The only difference from image 1 is that I used a yellow background, and now nothing works anymore. I created the box file and set the same positions as for the first image:
0 5 4 20 22 0
1 27 4 38 21 0
2 48 4 60 22 0
3 71 3 83 22 0
4 94 5 109 22 0
5 119 5 131 22 0
6 143 5 157 22 0
7 172 5 184 22 0
8 197 5 211 23 0
9 224 5 238 22 0
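Before blaming the background, it can help to confirm that the box data itself is sane. Below is a minimal sketch that parses the entries above, assuming the usual Tesseract box layout of `character left bottom right top page`. The names `Box`, `parse_boxes`, and `check_boxes` are made up for illustration, and the image size passed to `check_boxes` is a placeholder, since the real dimensions are not given here.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One box-file entry (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_boxes(text: str) -> List[Box]:
    """Parse whitespace-separated box entries, six fields per character."""
    tokens = text.split()
    if len(tokens) % 6 != 0:
        raise ValueError("expected 6 fields per box entry")
    boxes = []
    for i in range(0, len(tokens), 6):
        left, bottom, right, top, page = (int(t) for t in tokens[i + 1:i + 6])
        boxes.append(Box(tokens[i], left, bottom, right, top, page))
    return boxes


def check_boxes(boxes: List[Box], width: int, height: int) -> List[str]:
    """Return a list of problems: inverted or out-of-image boxes."""
    problems = []
    for b in boxes:
        if not (b.left < b.right and b.bottom < b.top):
            problems.append(f"{b.char}: inverted or empty box")
        if b.left < 0 or b.bottom < 0 or b.right > width or b.top > height:
            problems.append(f"{b.char}: box outside the {width}x{height} image")
    return problems


BOX_TEXT = """0 5 4 20 22 0 1 27 4 38 21 0 2 48 4 60 22 0 3 71 3 83 22 0
4 94 5 109 22 0 5 119 5 131 22 0 6 143 5 157 22 0 7 172 5 184 22 0
8 197 5 211 23 0 9 224 5 238 22 0"""

boxes = parse_boxes(BOX_TEXT)
# 250x30 is an assumed image size, used only for this illustration.
print(check_boxes(boxes, width=250, height=30))
```

If this reports no problems for the real image size, the coordinates are probably not the cause, and the image itself (its background and contrast) is the more likely culprit.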
Well, I trained with the box file, but the resulting .tr file was empty. I didn't stop there and completed the other steps anyway. The resulting font is not usable!
So my question is: how do I train Tesseract to recognize the digits no matter what background is used behind them?
Edit 2016-04-16:
I used ImageMagick to preprocess the images and found a command that works for all kinds of backgrounds. I wanted to train Tesseract with the created images, but it doesn't work as I thought it would. First of all I created the box files, but they came out empty. I used a website to organize the character positions and spent a lot of time making the cropping perfect! Afterwards I created the resulting .tr files and did the other steps to train Tesseract.
Finally I got the "traineddata" file, moved it to the "tessdata" directory of Tesseract, and used it the way it should be used:
tesseract example.jpg output -l mg
(I called the new font "mg".)
Okay, whatever I do, it doesn't recognize the digits! I opened this thread to find help, but until now nobody has a clue how to do this, sadly. Please help me out.
The whole set of Tesseract training files that I used and created can be found here:
Tesseract training directory (not zipped or compressed, just a view of the files in the directory)
You can convert the color image to a binary (black and white) image and run Tesseract on that. This way, no matter which background color you are using, you get the same result.
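As a rough illustration of that binarization step, here is a minimal pure-Python sketch that converts RGB pixels to grayscale and splits them with an Otsu threshold. It assumes dark digits on a lighter background (the digit color is not stated in the question), takes the image as nested lists of `(r, g, b)` tuples, and ignores any alpha channel. The function names are made up for this example; in practice an image library or ImageMagick would do the same job.

```python
def to_gray(pixel):
    """Integer luminance (0-255) from an (r, g, b[, a]) tuple; alpha is ignored."""
    r, g, b = pixel[:3]
    return (299 * r + 587 * g + 114 * b) // 1000


def otsu_threshold(gray_values):
    """Return the threshold t that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(rows):
    """Map each pixel to 0 (dark, at or below threshold) or 255 (light)."""
    gray = [[to_gray(p) for p in row] for row in rows]
    t = otsu_threshold([v for row in gray for v in row])
    return [[0 if v <= t else 255 for v in row] for row in gray]


YELLOW, BLACK = (255, 255, 0), (0, 0, 0)
image = [[YELLOW, BLACK, YELLOW],
         [YELLOW, BLACK, BLACK]]
print(binarize(image))
```

The same preprocessing should be applied both to the training images and to the images you later pass to Tesseract, so that the trained font and the input look alike.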