Doctoral research project number: 8209

Description

Submission date: 21 September 2021
Titre: Prediction of demographic indicators from remote sensing images
Thesis supervisor: Laurent WENDLING (LIPADE)
Scientific field: Information and communication sciences and technologies
CNRS theme: Images and vision

Abstract:

Objective (Diip Funding)
In this PhD, which stems from and strengthens an ongoing collaboration between LIPADE and INED, the candidate will develop deep-learning-based methodologies that use remote sensing data to predict indicators of the environment and of environmental change for demographic analysis. The objective of this topic is twofold: to propose methodological contributions for the large-scale extraction of diachronic environmental indicators, and to analyze their contribution to spatial population and health analyses. How do these indicators compare with existing environmental data? What results do they yield regarding the impact of environmental characteristics and environmental change on population structure and health in Sub-Saharan Africa? We expect major results in computer science (innovative methodologies) and in demography (a better understanding of local inequalities in population structure and health), as well as a contribution to the use of fine-grained remote sensing data analysis for population studies.

Context and Subject
In a globalized context that is increasingly affected by climate change and is undergoing rapid population growth and urbanization, demographic studies would benefit from taking environmental data into account more fully and from being carried out at the transnational level. However, this is not always possible in Sub-Saharan Africa, where matching, harmonized demographic and environmental data are seldom available. The large amount of spatial data regularly acquired since 2015 (in 2019 alone, the European Space Agency's Sentinel satellites produced 7.54 PiB of open-access data) is an opportunity to produce standardized and up-to-date indicators. Several indicators have been developed to help understand geographical realities in a consistent (i.e. not location-dependent) manner.
Among them, local climate zones (LCZ) have been proposed by WUDAPT (World Urban Database and Access Portal Tools) to label urban areas systematically. WUDAPT's goal is to provide an open-access map of the world following this legend, which researchers can then use for a wide range of studies. These data have been used to study energy usage, climate and geoscience modeling, and land consumption. A significant amount of work has been dedicated in recent years to the automatic generation of such data from sensors such as Landsat 8 or Sentinel 2. In a research competition organized by the IEEE IADF, several methods were proposed to map LCZ from Landsat, Sentinel 2 and OpenStreetMap data. Another recent study focused on using Convolutional Neural Networks (CNNs) to map LCZ automatically, and a large-scale benchmark dataset was proposed, with an attention-based CNN as a baseline. However, these works mostly focused on developed urban areas. For instance, the challenge targeted Berlin, Hong Kong, Paris, Rome, São Paulo, Amsterdam, Chicago, Madrid, and Xi'an. This is problematic, because developed cities are generally already well mapped through governmental censuses and because the spatial generalization of machine-learning-based methods remains a challenge. It is therefore necessary to develop methods adapted to the global South. In DHS surveys, a geospatial covariate dataset corresponding to the approximate locations of the clusters interviewed can be matched to the household, male and female datasets. The geospatial data stem from international programs that aim to provide estimates of environmental variables at the scale of the planet, such as population (Worldpop), temperature and rainfall (CRU, Worldclim), vegetation (VIP), urbanisation (GHSL), etc. Most of these are based on large-scale estimates derived from Landsat data and are defined over a 10 km buffer zone in rural areas and a 2 km buffer zone in urban areas.
However, a smaller buffer around a precise location has been shown to yield better results. We can therefore expect that a high-quality, localised indicator such as LCZ in African urban metropolises would yield better results when introduced into demographic analyses.

Co-supervision: Valérie Golaz (INED)
Co-advising: Sylvain Lobry (LIPADE-EDITE); Géraldine Duthe (INED)

Doctoral candidate: Rousse Basile