A learning base generation application for handwriting recognition

Fourth-year annual project of the INSA computer science department

Why Taliesin?

Handwritten documents, especially old ones, are difficult to read. To exploit them, we can use recognizers based on artificial intelligence.

However, automatic recognition requires a lot of training data. Therefore, thousands of examples need to be annotated.

Taliesin facilitates the import, slicing and annotation of handwritten documents, and thus offers a way to generate learning bases. A learning base (or training base) is a set of annotated examples, each pairing an image with its associated transcription.
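As an illustration only, a learning base as described above could be represented as follows. This is a minimal sketch, not Taliesin's actual data model: the class name `AnnotatedExample`, the helper `build_learning_base` and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedExample:
    """One entry of a learning base (hypothetical structure)."""
    image_path: str      # assumed: path to a cropped line or word image
    transcription: str   # text validated by the annotator for that image


def build_learning_base(pairs):
    """Build a learning base, skipping images that have no transcription yet."""
    return [
        AnnotatedExample(path, text.strip())
        for path, text in pairs
        if text.strip()
    ]


# Made-up sample data: the second image has not been annotated yet.
learning_base = build_learning_base([
    ("page1_line1.png", "Baptême de Jean"),
    ("page1_line2.png", "  "),
    ("page1_line3.png", "le 3 mars 1742"),
])
```

Only the two annotated images end up in `learning_base`; the blank transcription is left out because an example without text cannot be used for training.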

Presentation of the project

Taliesin is an application for generating learning bases for handwriting recognition systems. These data allow recognizers to build a model capable of making predictions on new documents.

To generate these learning bases, Taliesin offers an interface that facilitates the work of annotators. The training data sets are generated automatically by deep neural network-based recognizers that annotate each page. If a prediction is incorrect, the user can modify it manually with the help of auto-completion. Once the image database is annotated, the user can export the examples and use them to train handwriting recognizers.
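The workflow described above (automatic prediction, manual correction with auto-completion, then export) can be sketched roughly as follows. Everything here is an assumption made for illustration: `fake_recognizer` stands in for a neural recognizer, the auto-completion is a simple prefix match, and the CSV layout is not Taliesin's actual export format.

```python
import csv
import io


def autocomplete(prefix, vocabulary, limit=5):
    """Return up to `limit` vocabulary words starting with `prefix` (case-insensitive)."""
    lowered = prefix.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(lowered))[:limit]


def annotate(image_paths, recognizer, corrections):
    """Keep the recognizer's prediction unless the user supplied a correction."""
    return [(path, corrections.get(path, recognizer(path))) for path in image_paths]


def export_csv(examples):
    """Serialize (image, transcription) pairs as CSV text (assumed format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image", "transcription"])
    writer.writerows(examples)
    return buffer.getvalue()


def fake_recognizer(path):
    # Stand-in for a neural recognizer: returns a canned, partly wrong prediction.
    return {"a.png": "baptome", "b.png": "mariage"}[path]


# The user starts typing "bap" to fix the first prediction and picks the suggestion.
suggestions = autocomplete("bap", {"baptême", "mariage", "sépulture"})
examples = annotate(["a.png", "b.png"], fake_recognizer, {"a.png": suggestions[0]})
exported = export_csv(examples)
```

In this sketch the correction simply overrides the prediction for one image, while the other prediction is accepted as is; the exported text can then be fed to whatever training pipeline the recognizer uses.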

Interface

Our team

Our team is composed of seven fourth-year students from the computer science department of INSA Rennes.

Matisse BABONNEAU

Thomas BETTON

Corentin DUFOURG

Fabien LEFOYE

Elise MAUVIEUX

Glen POULIQUEN

Yuzhan WANG

Our partners

We would like to thank all our partners, as well as our supervisors: Alexandre GIMENEZ PUIG and Erwan FOUCHE, engineers at Sopra Steria, and Bertrand COUASNON, teacher-researcher at INSA/IRISA.

Archives Départementales d'Ille-et-Vilaine

The Archives Départementales d'Ille-et-Vilaine provide us with handwritten documents and are among Taliesin's beta-testers.

Sopra Steria

A French digital services company. Two of its engineers share their experience to help us work better as a team, manage the project with agile methods, and support us on the technical side.

INSA Rennes

Our engineering school, thanks to which we were able to carry out this project.

Doptim

A company creating AI and Big Data solutions in Brittany. One of its goals is to transcribe old handwritten parish registers into digital text. Doptim is one of Taliesin's beta-testers.

IntuiDoc

The IntuiDoc team at IRISA focuses its research on handwriting, gesture and document processing. The team provides us with recognizers and is one of Taliesin's beta-testers.