Using admin data and machine learning to predict dwelling occupancy on Census Night

Date:

Abstract

The Australian Census of Population and Housing (the Census) aims to count every person in Australia on a particular night, called Census Night. Dwellings that do not return a Census form and do not respond to the Australian Bureau of Statistics’ (ABS) follow-up campaign complicate this aim: are these dwellings unoccupied, or are they occupied and the residents unresponsive? To achieve its aim, the Census should count these unresponsive residents, but how can the ABS do this accurately? To answer these questions, the ABS has developed a model that uses administrative data, collected by various government and non-government organisations, to predict the occupancy status of a dwelling. The new method faces several challenges, including the lack of ground truth and strongly imbalanced classes. However, the method will improve the accuracy of ABS Census population counts and has been adopted as part of the 2021 Australian Census imputation process.