• Bokmål
  • English

Sitemap

Extracting information from historical Norwegian census forms

Extracting information from historical Norwegian census forms

This is part of a larger initiative to establish a historical population register for Norway.  Establishing such a database requires transcription of a large amount of historical records to be able to enter the information in the database. Where possible it is desirable to be able to automate parts of the transcription process for these documents. In this context NR has been working with image based methods to automate extraction of information from historical Norwegian census forms.

The focus has been on two types of census forms. The first is the single sheet form used in the 1891 census. These sheets contain information only about one individual and the forms have no tabular structure. Instead information is entered either as underlined options as answers to specific questions or as handwritten numbers or text on designated dotted lines. The aim of the work has been to extract underlined information and recognize handwritten digits from the individual sheets from these forms. The solution includes image registration, underline detection and recognition of handwritten numerals.

The second type of census sheets have a more common tabular structure, where all members of a household are listed sequentially on one sheet. The approaches being developed for these types of forms include image registration, table extraction, recognition of handwritten numbers and splitting and clustering of handwritten names.

Analysing fields from census forms.

 

Department

Financing

  • The Research Council of Norway
  • Mohn-fund at University of Tromsø
  • The Norwegian Institute of Public Health

Partners

  • University of Tromsø
  • Norsk Regnesentral.
  • The Norwegian Institute of Public Health
  • Statistics Norway
  • the National Archive
  • University of Oslo
  • Institute of local history
  • Association of computer genealogy
Postal address:
Norsk Regnesentral/
Norwegian Computing Center
P.O. Box 114 Blindern
NO-0314 Oslo
Norway
Visit address:
Norsk Regnesentral
Gaustadalleen 23a
Kristen Nygaards hus
NO-0373 Oslo.
Phone:
(+47) 22 85 25 00
Address How to get to NR
Social media Share on social media
Privacy policy Privacy policy
Postal address: Norsk Regnesentral/Norwegian Computing Center, P.O. Box 114 Blindern, NO-0314 Oslo, Norway
Visit address: Norsk Regnesentral, Gaustadalleen 23a, Kristen Nygaards hus, NO-0373 Oslo.
Phone: (+47) 22 85 25 00
AddressHow to get to NR