Processing the images
The first step of the project is to collect millions of images from a hundred archive depositories, analyze the structure of each document, and recognize the handwritten text. A portal will allow archive depositories and their publishers to upload images and the associated metadata. Automated methods will then be applied to extract the information the images contain: line detection, text recognition, consistency tests, etc. The end result of this process will be a "raw" database (RTD) that reproduces the listes nominatives as closely as possible: each image will be associated with the text it contains.
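As a rough illustration of how these stages might chain together, the sketch below links one uploaded image and its metadata to the text recognized on it. It is not the project's implementation: the names `RawRecord`, `process_image`, and `check_consistency` are hypothetical, the line detector and text recognizer are passed in as placeholder functions, and the "image" is a made-up list of strings standing in for real pixel data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical bounding box of one detected text line: (x, y, width, height).
Box = Tuple[int, int, int, int]


@dataclass
class RawRecord:
    """One entry of the "raw" database: an image and the text found on it."""
    image_id: str
    metadata: Dict[str, str]
    lines: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def check_consistency(lines: Sequence[str]) -> List[str]:
    """Toy consistency test (an assumption): flag empty transcriptions."""
    return [
        f"line {i}: empty transcription"
        for i, text in enumerate(lines)
        if not text.strip()
    ]


def process_image(
    image_id: str,
    image: object,
    metadata: Dict[str, str],
    detect_lines: Callable[[object], List[Box]],
    recognize_text: Callable[[object, Box], str],
) -> RawRecord:
    """Run line detection, text recognition, then the consistency test."""
    boxes = detect_lines(image)
    lines = [recognize_text(image, box) for box in boxes]
    return RawRecord(image_id, dict(metadata), lines, check_consistency(lines))


# Stand-ins for real models: each string plays the role of one handwritten line.
fake_image = ["Dupont Jean 34 cultivateur", "", "Dupont Marie 31"]


def fake_detect(image: object) -> List[Box]:
    # One box per row; the y coordinate is the row index.
    return [(0, y, len(row), 1) for y, row in enumerate(image)]


def fake_recognize(image: object, box: Box) -> str:
    # "Recognize" a line by reading back the row the box points at.
    return image[box[1]]


record = process_image(
    "img-0001", fake_image, {"source": "example"}, fake_detect, fake_recognize
)
print(record.lines)
print(record.warnings)
```

Passing the detector and recognizer as arguments keeps the record structure independent of whichever models are eventually used, and the warnings list gives later stages a place to look for pages that need review.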