Social Vulnerability Index Construction: Accessing Open Data from National Censuses


Miguel Toquica, GEM Social Vulnerability and Resilience Specialist, shares his insights on GEM's experience in accessing socio-economic data from national censuses and public online databases.


When it comes to accessing the demographic characteristics of the population of a country, researchers usually consider national population and household censuses as reliable sources of information. Ideally, most countries should update their national census data and procedures every 10 years.


The need to track socio-economic factors and statistical measures of societies is recognized globally as a way to better understand the living conditions and characteristics of a country’s population. In this regard, national censuses are considered the most reliable source of this type of information at specific levels of territorial organization, i.e. regions, states, parishes, and the local level. A national population and housing census has several uses for a country. It provides not just the total number of people and households but also the demographic information needed for population estimates, as well as specific information for national agencies in the fields of education, health and economy. A national census also gives quantified information on the socio-economic conditions of specific subdivisions and groups of people in a country.


At GEM we are collecting and processing national census data for our research on which socio-economic conditions could contribute to the population’s vulnerability to natural hazards, e.g. earthquakes, volcanic eruptions, landslides, flooding, hurricanes, and droughts. In most cases, our research using census data has led us to information on pre-existing characteristics, e.g. average household size and unemployment rates, that relate directly to why differential impacts from natural hazards occur across space.


Social vulnerability helps to explain why some areas, such as a country’s sub-national parishes or city neighbourhoods, will experience the consequences of a natural hazard in different ways. Understanding the varying impacts of a natural hazard through social vulnerability assessments is a critical element for risk reduction, elaboration of mitigation plans, and the development of public policies to reduce the risk. To measure social vulnerability, the starting point is to capture the contextual conditions within the social structure of the study area.


This social structure includes characteristics of the population and factors that increase or decrease the impact of natural hazards on the community. These factors include access to basic needs (potable water, electricity, and sanitary services), access to education and health, and characteristics of specific groups within society that make them vulnerable, e.g. the elderly, children, people with disabilities, ethnic groups and so forth. As an example, indigenous people, like the women of the Wayuu ethnic community in Colombia who work in the crafts industry (Figure 2), typically live in isolated regions where access to financial means, basic public services such as potable water and electricity, and public infrastructure is difficult or non-existent.


These conditions may compromise their capacity for disaster preparedness and make it harder for government agencies to respond and conduct recovery efforts, thereby increasing their vulnerability in case of an emergency. In this context, information obtained from national censuses in Latin America has allowed the Social Vulnerability and Resilience (SVR) team at GEM (i) to develop databases of social vulnerability indicators, and (ii) to construct social vulnerability indices for over 20 countries in Latin America and the Caribbean. This task has been possible thanks to the online access to national census databases that several countries in Latin America and the Caribbean provide, for example through CELADE-Redatam.


To start the development and construction of social vulnerability indices, GEM’s SVR team obtained the most recent socioeconomic data from available national population and housing censuses from countries in Latin America and the Caribbean. The collected raw values within the population, economy, infrastructure, health, and education dimensions were then processed to obtain standardised values using percentage, per capita, and density functions that can be used for country comparisons.
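The conversion from raw counts to comparable values can be illustrated with a minimal Python sketch. It is not GEM's actual processing code: the field names, the choice of indicators, and the sample figures are all hypothetical, chosen only to show one percentage, one per capita, and one density function.

```python
from dataclasses import dataclass


@dataclass
class RawUnit:
    """Raw census counts for one subnational unit (all field names are hypothetical)."""
    name: str
    population: int
    households: int
    households_without_water: int
    hospital_beds: int
    area_km2: float


def standardise(unit: RawUnit) -> dict:
    """Turn raw counts into values that can be compared across units of different size."""
    return {
        "name": unit.name,
        # Percentage function: share of households lacking potable water.
        "pct_households_without_water": 100.0 * unit.households_without_water / unit.households,
        # Per capita function: hospital beds per 1,000 inhabitants.
        "hospital_beds_per_1000": 1000.0 * unit.hospital_beds / unit.population,
        # Density function: inhabitants per square kilometre.
        "population_density_km2": unit.population / unit.area_km2,
    }


# Invented sample figures, for illustration only.
units = [
    RawUnit("Unit A", 50_000, 12_500, 2_500, 100, 250.0),
    RawUnit("Unit B", 8_000, 2_000, 1_200, 8, 1_600.0),
]
standardised = [standardise(u) for u in units]
```

Although Unit A has more households without water in absolute terms, the percentage shows that Unit B is proportionally far worse off, which is the reason raw counts are not compared directly.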


In addition, a multivariate statistical analysis has been conducted to select a consistent set of indicators for all countries. The socio-economic variables obtained are then standardised and rescaled to create a set of indicators on the same scale of measurement. The analysis also includes a correlation analysis, which quantifies the association between two continuous variables and so helps narrow the data down to a regional set of variables that acceptably represent the social vulnerability, economic resilience and recovery capacity of the population in Latin America and the Caribbean. Figure 3 provides an example of the Social Vulnerability Index for Central America and the Caribbean region.
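As a rough illustration of the rescaling and correlation steps, the sketch below uses min-max rescaling and the Pearson correlation coefficient. Both are assumptions made for the example, since the article does not name the specific methods used. The indicator names, the sample values, and the 0.9 threshold are likewise hypothetical.

```python
import math


def min_max_rescale(values):
    """Rescale values linearly to the range [0, 1] (assumed rescaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant indicator carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def redundant_pairs(indicators, threshold=0.9):
    """List indicator pairs whose absolute correlation meets a (hypothetical) threshold."""
    names = sorted(indicators)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(indicators[first], indicators[second])
            if abs(r) >= threshold:
                pairs.append((first, second, round(r, 3)))
    return pairs


# Invented values for four subnational units, for illustration only.
indicators = {
    "pct_no_water": [20.0, 60.0, 35.0, 10.0],
    "pct_no_sanitation": [22.0, 58.0, 37.0, 12.0],
    "unemployment_rate": [12.0, 6.0, 7.0, 9.0],
}
rescaled = {name: min_max_rescale(vals) for name, vals in indicators.items()}
flagged = redundant_pairs(rescaled)
```

In this invented data, only the water and sanitation indicators are flagged as strongly correlated. A pair like this is a candidate for keeping just one of the two, so that the same underlying condition is not counted twice in the index.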



Even though the process of collecting, processing and building SV databases and indices seems quite straightforward, accessing the data from censuses and other sources can prove to be challenging and sometimes frustrating. In some cases, the censuses are not fully available, or they are not provided in the desired working format. Some of the most common challenges we have encountered, and possible solutions, are outlined below:


- There is a lack of common data processing techniques that are compatible across all countries.


Trying to keep the same standards for data and selected indicators across all countries may not always work, as censuses are conducted on different bases and using different techniques. This may result in slightly different social vulnerability datasets per country, and therefore the final indicator selection and index composition may differ from country to country.


This challenge has been minimized by performing multivariate and correlation analyses on the full set of socio-economic indicators. This technique allows the SVR team to carefully select a set of indicators that better represent the themes of social vulnerability, maintaining the robustness and composition of the index in all cases.


- Not all statistical services in each country make the entire census available using a simple database or accessible format.


This makes accessing and post-processing the data difficult. Some countries do not even make their censuses open and available online. Nonetheless, new data extraction techniques have been implemented so that indices are built with the most reliable and recent datasets.


- Accessing the most recent data from national censuses can be difficult.


Some census data date back to the late 1990s or early 2000s. Old data must be used with caution, as the final results may be skewed. Up-to-date country statistics may provide proxies for specific indicators; for example, the total population and employment rate can be updated on a yearly basis for some countries. However, processing quantities using data from different time periods can drastically change the unit of measure of otherwise comparable values, so special care is essential when doing so.
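One simple way to exercise that care is to record the reference year next to every value and refuse to combine values whose years do not match. The sketch below is only an illustration of that idea, not a description of GEM's workflow. The `Observation` type, the `max_year_gap` parameter, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A single value together with the year it refers to (hypothetical helper type)."""
    value: float
    year: int


def rate_per_100(numerator: Observation, denominator: Observation, max_year_gap: int = 0) -> float:
    """Compute a percentage only if both inputs refer to (nearly) the same year."""
    gap = abs(numerator.year - denominator.year)
    if gap > max_year_gap:
        raise ValueError(
            f"Reference years differ by {gap} years "
            f"({numerator.year} vs {denominator.year}); refusing to combine them."
        )
    return 100.0 * numerator.value / denominator.value


# Invented figures, for illustration only.
unemployed_2005 = Observation(4_000, 2005)
labour_force_2005 = Observation(50_000, 2005)
labour_force_2020 = Observation(80_000, 2020)

# Same reference year: the rate is meaningful.
consistent_rate = rate_per_100(unemployed_2005, labour_force_2005)

# A 2005 count divided by a 2020 total would silently understate the rate,
# so the mismatch is reported instead of being computed.
try:
    rate_per_100(unemployed_2005, labour_force_2020)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Allowing a small `max_year_gap` is a judgment call. A gap of one or two years may be tolerable for slowly changing quantities, but that tolerance should be an explicit decision, not an accident of which files were available.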


The GEM social vulnerability team continues to overcome these challenges and to improve data collection for index construction. We are also proud to produce the subnational social vulnerability databases and indices and make them available to the public. This work is a pivotal component of other risk information products developed at GEM.


