Presentation

  • Developers

    The platform is designed and developed by a team of fourth-year Computer Science students at INSA Rennes.

  • Customers

    The departmental archives of Ille-et-Vilaine digitize enormous quantities of documents to make them publicly available.

  • Partners

    The algorithms of the IntuiDoc team recognize the text in the digitized archives and make it possible to exploit these documents.

  • Partners

    Doptim is a start-up specializing in data analysis and Big Data technologies. It is working on a product that acts as a genealogist, saving the time otherwise spent digging through and tediously deciphering millions of digital documents.

Context

The mission of the departmental archives of Ille-et-Vilaine is to preserve documents and to make them publicly available. Faced with a very large volume of data and a wide diversity of documents, the archives launched a major digitization campaign. The Gutemberg project was introduced to develop a platform that makes these digitized documents available to the public.

  • Volume: huge volume of data
  • Plurality: wide diversity of documents
  • Consultation: make these documents publicly available
  • Usability: simple consultation and search
  • Collaborative: share knowledge

The application

The Gutemberg project is a Web application for consulting old digitized documents provided by the departmental archives of Ille-et-Vilaine. The application is centered on consultation, with an interface that simplifies use on smartphones and tablets.

Gutemberg is a web application focused on simple, ergonomic document consultation, so that as many heritage conservation stakeholders as possible can contribute to enriching the platform. We propose a generalist solution that applies to all types of documents and is collaborative, giving a large role to annotations. User contributions will make it possible to fill in missing information where a character recognition tool would have missed or misinterpreted text because of poorly legible handwriting or the use of specialized vocabulary. This data will be valuable for keyword searches on documents.
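The idea that annotations can complete recognized text for keyword search can be illustrated with a minimal sketch. It is not the platform's actual implementation: the `Document` record, its field names, and the sample text are all hypothetical, and the search is a plain in-memory lookup rather than a real search engine.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record; field names are illustrative only."""
    doc_id: str
    recognized_text: str  # output of a text recognition tool
    annotations: list[str] = field(default_factory=list)  # user contributions


def searchable_terms(doc: Document) -> set[str]:
    """Collect lowercase keywords from recognized text and user annotations."""
    terms: set[str] = set()
    for text in [doc.recognized_text, *doc.annotations]:
        terms.update(word.strip(".,;:!?").lower() for word in text.split())
    terms.discard("")
    return terms


def keyword_search(docs: list[Document], keyword: str) -> list[str]:
    """Return the ids of documents whose terms contain the keyword."""
    needle = keyword.lower()
    return [d.doc_id for d in docs if needle in searchable_terms(d)]


# Invented sample: the recognition step misread a handwritten name.
register = Document("reg-1", "Naissance de Jean M??tin, 1874")
print(keyword_search([register], "martin"))  # not found: the name was misread

# A reader's annotation supplies the missing reading.
register.annotations.append("Jean Martin")
print(keyword_search([register], "martin"))  # found thanks to the annotation
```

In this sketch the annotation does not replace the recognized text. Both are kept and both feed the search, which matches the goal of offering several possible readings of the same document.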

The generic storage format will allow actors like Doptim to easily exploit the content of the platform's documents, for genealogical purposes for example. This type of application is what justifies the platform: a simple query could save hours of searching through registers in the archives.

The long hours of searching among the shelves of the departmental archives are over. The search tool returns the documents the user wants quickly and effectively. The application also offers an advanced search tool that lets the user specify the criteria of the query.
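As an illustration of what specifying search criteria could look like, here is a small sketch. The criteria shown (keyword, document type, year range), the `ArchiveDocument` record, and the sample entries are assumptions made for the example, not the platform's real search fields or data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchiveDocument:
    """Hypothetical metadata record; the real schema is not described here."""
    title: str
    doc_type: str
    year: int


@dataclass
class SearchCriteria:
    """Optional filters; a criterion left as None is ignored."""
    keyword: Optional[str] = None
    doc_type: Optional[str] = None
    year_from: Optional[int] = None
    year_to: Optional[int] = None


def matches(doc: ArchiveDocument, c: SearchCriteria) -> bool:
    """Return True when the document satisfies every criterion that is set."""
    if c.keyword is not None and c.keyword.lower() not in doc.title.lower():
        return False
    if c.doc_type is not None and doc.doc_type != c.doc_type:
        return False
    if c.year_from is not None and doc.year < c.year_from:
        return False
    if c.year_to is not None and doc.year > c.year_to:
        return False
    return True


def advanced_search(docs: list[ArchiveDocument],
                    criteria: SearchCriteria) -> list[ArchiveDocument]:
    """Keep only the documents matching the given criteria."""
    return [d for d in docs if matches(d, criteria)]


# Invented sample entries, for illustration only.
catalogue = [
    ArchiveDocument("Naturalization decree, Rennes", "decree", 1892),
    ArchiveDocument("Local newspaper, Rennes edition", "press", 1905),
    ArchiveDocument("Naturalization decree, Saint-Malo", "decree", 1921),
]

query = SearchCriteria(keyword="rennes", doc_type="decree", year_to=1900)
for doc in advanced_search(catalogue, query):
    print(doc.title)
```

A simple search corresponds to setting only `keyword`; each additional criterion narrows the result set.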

Gutemberg is a multi-document application: it makes a large number of documents of all types publicly available, such as old newspapers, roll registers, naturalization decrees...

The choice of a responsive interface makes Gutemberg an elegant tool for consulting documents on screens of different sizes. The user can consult, annotate or simply read a document, whether with a mouse or a touch device.

The annotation system makes Gutemberg a collaborative application: users can share and compare different interpretations of the same document, and the reader is offered several possible readings.

The application allows every reader to register in order to contribute to the project by proposing their own annotations. These annotations are then visible to the other users when they consult the document.

Gutemberg's user system allows the archives to offer readers a genuinely collaborative experience while retaining, through moderators and administrators, an easy way to manage this community. An intuitive administration interface lets administrators manage user accounts, add new documents, and define annotation form formats.

Gutemberg has been designed to support a large number of documents, and to use the MongoDB NoSQL database or the ElasticSearch indexing and search engine.

The team

 
 

Nolwenn

Clément

Lucas

Maxime