Software


Optical Character Recognition (OCR) Annotation Tools

Precisely annotated data are necessary both for training an OCR model and for evaluating OCR methods. We therefore propose and implement two machine-learning-based tools that simplify the annotation process. These tools create ground truth for the line images used to train current OCR systems.

Character Segmenter

Character Segmenter segments text lines into individual characters. It also saves the annotated line images together with the character separator positions.
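
To make the notion of character separator positions concrete, here is a minimal sketch of a classical vertical projection baseline. It is not the method used by Character Segmenter, which relies on machine learning. The input format (a binarized line image given as a list of rows, with 1 for ink and 0 for background) and the function name `separator_positions` are assumptions made for illustration only.

```python
def separator_positions(line_image):
    """Return the centre column of every blank gap between inked columns.

    line_image is assumed to be a list of equally long rows of 0/1 values,
    where 1 marks an ink pixel (an illustrative format, not the tool's own).
    """
    width = len(line_image[0])
    # Vertical projection profile: number of ink pixels in each column.
    profile = [sum(row[c] for row in line_image) for c in range(width)]

    separators = []
    gap_start = None   # first column of the blank gap currently being scanned
    seen_ink = False   # ignore the blank margin before the first character
    for c, ink in enumerate(profile):
        if ink == 0:
            if seen_ink and gap_start is None:
                gap_start = c
        else:
            if gap_start is not None:
                # The gap spans columns gap_start .. c - 1; take its centre.
                separators.append((gap_start + c - 1) // 2)
                gap_start = None
            seen_ink = True
    # A blank margin after the last character is never closed, so it is ignored.
    return separators


line = [
    [1, 1, 0, 0, 0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
]
print(separator_positions(line))
```

A projection baseline of this kind fails on touching or overlapping characters, which is one reason a learned segmenter followed by manual checking is useful.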

Line Annotator

Line Annotator uses a trained OCR model to predict the character transcription of a line image. A human annotator then checks the prediction and corrects it if needed.
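
One way to quantify how much correction a predicted line needed is the character error rate between the model's prediction and the annotator's final transcription. The sketch below is an illustration only: the tool is not stated to compute this metric, and the function names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def character_error_rate(predicted, corrected):
    """Edit distance normalised by the length of the corrected transcription."""
    if not corrected:
        # Convention chosen for this sketch: an empty reference gives 0.0
        # for an empty prediction and 1.0 otherwise.
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, corrected) / len(corrected)


# One substituted character in an 11-character line.
print(character_error_rate("Hcllo world", "Hello world"))
```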

For further information about these tools, see the paper below. Please cite it when you use the tools or their source code.



Enhanced Local Binary Patterns (E-LBP) method

This is an implementation of a novel automatic face recognition approach based on local binary patterns (LBP). The LBP descriptor computes features from the local neighbourhood of a pixel. The original operator is not very robust to image noise, variance and differing illumination conditions. The enhanced local binary patterns (E-LBP) method addresses these issues by extending the original LBP operator to consider more pixels and different neighbourhoods when computing the feature vector.
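
For readers unfamiliar with the baseline, the following minimal sketch shows the original 8-neighbour LBP operator that E-LBP extends. It does not implement E-LBP itself. The bit ordering (clockwise from the top-left neighbour) and the use of `>=` in the comparison are conventions assumed here, and the image is represented as a plain list of rows of grey levels for simplicity.

```python
def lbp_code(image, row, col):
    """Return the basic 8-neighbour LBP code of the pixel at (row, col)."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
# Neighbours 9, 7 and 8 are >= 6, setting bits 1, 3 and 4: 2 + 8 + 16.
print(lbp_code(patch, 1, 1))  # prints 26
```

Because each code depends on only eight comparisons with a single centre pixel, a small amount of noise in that pixel can flip several bits at once, which illustrates the sensitivity described above.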

We evaluated this method on two benchmark corpora, the UFI and FERET face datasets. Our experiments show that the approach is very effective: it significantly outperforms several other state-of-the-art methods and performs particularly well in real conditions, where the above-mentioned issues are pronounced.

For further information about the E-LBP approach, see the paper below. Please cite this paper when you use this source code.



RecSpe - Automatic Speaker Recognition Toolkit

RecSpe is a toolkit for automatic speaker recognition. It provides the following main functionality: speech recording; speech parametrisation (MFCC, PLPC, LPREFC, LPCEPSTRA and Discrete Wavelet Transform algorithms are available); speaker classification (training and testing procedures with several classifiers such as GMM, MLP, etc.); and speaker segmentation (when more than one speaker is speaking). RecSpe is based on the Qt plug-in system, so its functionality can be easily extended.



jDALabeler - Tool for Dialog Act Corpus Labeling

jDALabeler is a tool for manual Dialog Act (DA) corpus labeling. Dialog acts are saved in predefined schemes (Meeting Recorder Dialogue Act, Verbmobil, etc.). jDALabeler also allows additional DA schemes to be created if necessary.



AutoFaceRec - Automatic Face Recognition System

Automatic Face Recognition System (AutoFaceRec) is a toolkit designed for face detection and automatic recognition in real-world photographs, that is, ordinary photographs not acquired in a controlled environment. The quality of such photographs is significantly lower than that of the photographs usually used for testing Automatic Face Recognition (AFR) methods. Faces in these photographs are often rotated, tilted or occluded, and the pose is not uniform, which makes recognition very difficult. The toolkit allows a fully automated face recognition system to be assembled from the following modules, depending on the user's needs. Five main modules are implemented:



Corpora

Czech Text Document Corpus v 1.0

The corpus contains Czech newspaper articles provided by the Czech News Agency (CTK). Each article is annotated with labels selected from a set of 60 categories.

This corpus is available free of charge for research purposes only. Commercial use in any form is strictly excluded. For further information about the corpus, see the paper below:

  • M. Hrala, P. Kral, Evaluation of the Document Classification Approaches, in 8th International Conference on Computer Recognition Systems (CORES 2013), Milkow, Poland, 27-29 May 2013, pp. 877-885, FullText.
  • Please cite this paper when you use these texts in your experiments.



Real Face Recognition Corpus (REFARECO) v 1.0

The Real Face Recognition Corpus (REFARECO) is a set of real-world photographs randomly selected from the large Photobank of the Czech News Agency. It is intended for the evaluation of face detection and automatic face recognition algorithms. It is composed of images of individuals taken in an uncontrolled environment. All images were obtained over a long time period (20 years or more). The corpus contains grayscale images of 561 individuals, each image 384 x 384 pixels in size. At least 10 images are available for each person.

This corpus is available free of charge for research purposes only. Commercial use in any form is strictly excluded.

Because of the large corpus size, only a sample of the corpus can be downloaded directly. The whole corpus will be sent on DVD upon request to the authors: llenc@kiv.zcu.cz or pkral@kiv.zcu.cz.