Paper for the Sixth Meeting of the Society for the Study of Economic Inequality (ECINEQ), July 13-15,

2015, Luxembourg

Using the Household Finance and Consumption Survey (HFCS) for a joint

assessment of income and wealth taxes: Prospects, limitations and

suggestions for policy simulations

<Draft paper, please do not quote>

Francesco Figari¹, Sarah Kuypers², Gerlinde Verbist²

1 University of Insubria and ISER University of Essex

2 Centre for Social Policy, University of Antwerp

Abstract We explore the prospects for using the Eurosystem Household Finance and Consumption Survey

(HFCS) dataset as an underlying micro-database for policy simulation across euro zone countries. In

particular, we consider the issues to be addressed and the advantages arising from building a

database from the HFCS for the EU tax-benefit model, EUROMOD. EUROMOD is currently running

mostly on EU-SILC data, but is built in a way that maximises its flexibility and possibility to simulate

tax-benefit policies on different databases. This will allow expanding the policy domains currently

covered in EUROMOD with dimensions like wealth taxation, which recently gained much

prominence, in the academic as well as the public debate. As the HFCS only contains gross income

amounts which are not suitable for redistributive analysis, the purpose of this paper is to derive net

incomes by simulating the gross-to-net transition with EUROMOD taking into account all important

details of the social security and personal income tax system. In order to identify the issues and illustrate

their importance, a trial database for Belgium is constructed. We conclude that, although

transforming the HFCS into a database for EUROMOD would require a significant amount of effort,

this is surely worthwhile because of the interesting possibilities to extend the policy scope of

EUROMOD and to consider jointly the redistributive effect of income and wealth taxes. Moreover,

the derivation of disposable income allows one to consider the joint distribution of income, wealth

and consumption, which can be used to analyse issues relating to inequality and poverty.

Key words: EUROMOD, HFCS, simulations, gross-to-net incomes, wealth taxation

JEL Classification: C15, H24, I3

1 Introduction

The increasing accumulation of private wealth in Europe appears as one of the most striking

evolutions in the distributional literature over the last 40 years. The aggregate private wealth-

national income ratios have nowadays returned to levels observed in the 19th century, ranging from

400% to 700%. Such levels are determined by different economic factors, such as the long-run asset

price recovery effect, high saving rates and low economic growth rates, at least partially sustained by

pro-capital policies (Piketty and Saez, 2013). Focusing on household resources the ratios between

private wealth and disposable income are even higher; the rise of freely provided public services and

in-kind transfers such as health and education that occurred since the 1970s is an important factor in

explaining why disposable income has declined relative to national income (Piketty and Zucman,

2014).

High wealth-income ratios are not necessarily bad but they raise challenging issues about capital

taxation (Piketty, 2011, 2014) and the overall structure of inequality (Davies, 2009). First, while in the

last 50 years the contribution of wealth taxes to government revenues has diminished, there are

strong arguments nowadays for broadening the existing tax bases to include wealth and income from

wealth both on horizontal and vertical equity grounds. As long as wealth is more unequally

distributed than income, wealth taxes are attractive in distributional terms. Moreover, as far as

economic efficiency is concerned, wealth taxes minimise economic distortions by taxing fixed factors

(Hills, 2013). In a recent contribution, Piketty (2013) proposes to go beyond the national boundaries

and suggests the introduction of a comprehensive wealth tax at the European level, based on the

market value of the net personal worth. Second, given that wealth is in general very concentrated

(with a Gini coefficient ranging between 0.5 and 0.8 over time and across countries) and correlated

with income, the inequality of wealth is likely to exacerbate overall inequality. Taxing wealth is a way

to reduce this inequality. Hence, it is important to assess the role of the different wealth components

across countries, in order to set appropriate tax-free allowances and concentrate the tax burden on

the wealthy part of the population, given the increasing role of housing assets in the household’s

portfolio along the entire income distribution (Figari, 2013).

In such a context the need for more comprehensive and integrated data on individual well-being is

widely recognised, as highlighted in the Report by the Commission on the Measurement of Economic

Performance and Social Progress (Stiglitz, Sen and Fitoussi, 2009). In order to identify better

measures of economic performance in a complex economy and thus go 'Beyond GDP', Stiglitz, Sen

and Fitoussi recommend considering income, consumption and wealth and giving more prominence

to their joint distribution. New household surveys such as those developed as part of the Luxembourg

Wealth Study (Jäntti et al. 2013) and the Eurosystem Household Finance and Consumption Network

(HFCN, 2013a) represent a milestone in this ongoing process to better measure individual well-being.

Nevertheless, empirical research still faces severe limitations, mainly due to data availability, that

constrain what can be achieved. This paper aims at contributing to the recent

developments in this area by exploring the prospects for using the Eurosystem Household Finance

and Consumption Survey (HFCS) dataset as an underlying database for a tax-benefit microsimulation

model. In particular, we consider the issues to be addressed and the advantages arising from building

a database from the HFCS for the EU-wide tax-benefit model, EUROMOD. Although it is currently

running mostly on EU-SILC data, EUROMOD is built in a way that maximises its flexibility and

possibility to simulate tax-benefit policies on different databases (Sutherland and Figari, 2013).

The main advantages of incorporating the HFCS data in EUROMOD are twofold. First, it allows us to

expand the policy domains currently covered in EUROMOD with dimensions like wealth taxation,

which recently gained much prominence in the academic as well as the public debate. In addition to

budgetary and distributional analysis of current wealth taxes, the model based on HFCS data would

allow for an integrated assessment of taxable capacity taking into account direct taxes on income

and wealth and tackling challenging issues such as those faced by ‘asset rich/income poor’

households (Hills, 2013). Moreover, it would make it possible to estimate the impact of reforms in wealth

taxation in interaction with other tax-benefit policies. Second, as the HFCS contains only gross

income amounts which are not suitable for redistributive analysis, we derive net incomes by

simulating the gross-to-net transition with EUROMOD taking into account all important details of the

social security and personal income tax system. For the first time, this allows us to consider the joint

distribution of disposable income, wealth and consumption based on information coming from the

same survey, potentially comparable across countries and time.

In order to identify the issues and illustrate their importance, a trial database for Belgium is

constructed. We conclude that, although transforming the HFCS into a database for EUROMOD

would require a significant amount of effort, this is surely worthwhile because of the

interesting possibilities to extend the policy scope of EUROMOD and also to consider jointly the

redistributive effects of income and wealth taxes. Moreover, the derivation of disposable income

allows one to consider the joint distribution of income, wealth and consumption.

In the next section we briefly describe the advantages and limitations of tax-benefit models. In

section 3 we discuss what the HFCS data can contribute to policy simulation in EUROMOD. In section

4 we discuss the assumptions and transformations needed to construct a EUROMOD database on the

basis of the HFCS data, where Belgium is used as a case study. Section 5 then studies the results of

the derivation of net incomes for the HFCS data and validates them against the EU-SILC and where

possible external sources. The last section concludes.

2 Purpose of a tax-benefit model on the HFCS data

The main advantage of a tax-benefit model is that it allows one to focus quite accurately on the

objectives of social and economic policy, on the tools employed, and on the structural change

experienced by those to whom the measures apply. Unlike a macroeconomic model, a

microsimulation model allows one to simulate individual decision units. These decision units are in

the case of the HFCS data households and the individuals that live in them. Fiscal rules are

incorporated into the model as accurately as possible, so that the impact on the individual

characteristics of a decision unit becomes apparent; the impact of social security and taxation may,

after all, vary considerably for different units. The various decision units may also be aggregated

according to different characteristics (e.g. age, social and professional category). As such, the model

allows one to test the redistributive potential of different tax-benefit systems, while taking due

account of social and demographic variables. Another important advantage of this method is that it

allows one to study a set of policy measures from two distinct perspectives. On the one hand, one

can focus on the cumulative effect of the various measures, and therefore also on the impact of the

entire set of transfer-oriented measures. On the other hand, a microsimulation model offers the

possibility of dissecting complex measures (e.g. step-by-step tax calculation for a household), so that

the impact of each step may be considered separately.

As described in Figari et al. (2015) different types of analysis are facilitated by using a

microsimulation approach, among others:

- impact of tax-benefit policy changes (e.g. reforms regarding wealth and income taxation) on

income-based indicators and related statistics (e.g. poverty and inequality indicators);

- impact of demographic factors on disposable income through the effects of tax-benefit

policies (e.g. public support to families contingent on presence of children, see Figari et al.,

2011);

- impact of policy changes over time (e.g. profiles of gainers and losers of a policy indexation

and policy reforms);

- impact of policy changes on social indicators capturing work incentives (e.g. effective

marginal tax rates, participation tax rates) or social inclusion (e.g. multiple deprivation).

Of course, simulation models also have inherent limitations. These models use empirical data that

are either obtained by means of surveys or from administrative sources. As such, the accuracy of the

results depends on the quality of the data (e.g. adequate information about the relevant socio-

economic characteristics, a sufficiently large sample). Another limitation is the cost involved in

constructing and maintaining such a model: developing a tax-benefit model requires time and

money. Therefore, one will need to make certain considerations in terms of policy areas covered,

incorporation of demographic and macro-economic processes or behavioural reactions.

In order to exploit the cross-country dimension of the HFCS data, it is quite natural to build a

database from the HFCS for EUROMOD, the EU-wide tax-benefit model, rather than for separate

national tax-benefit models. Moreover, EUROMOD is built in a way that maximises its flexibility and

possibility to simulate tax-benefit policies on different databases.

EUROMOD simulates cash benefit entitlements and direct tax and social insurance contribution

liabilities on the basis of the tax-benefit rules in place and information available in the underlying

datasets. Instruments which are not simulated (mainly contributory pensions), as well as market

income are taken directly from the data (Sutherland and Figari, 2013). As such, EUROMOD is of value

in terms of assessing the first order effects of tax-benefit policies and in understanding how tax-

benefit policy reforms may affect income distribution, work incentives and government budgets in

the short term.

Currently EUROMOD runs on the EU-SILC data, which has only limited information on wealth and

income from wealth. Incorporating the HFCS-data will allow expanding the policy domains currently

covered in EUROMOD with dimensions like wealth taxation. This will enable simulations relating to

issues like a tax shift from income to wealth (a currently hotly debated topic in e.g. Belgium). It will

help to understand and measure the redistributive role of these policies, in relation to the other tax-

benefit rules. As subsequent waves of the HFCS become available, the microsimulation model will

also make it possible to investigate changes over time and to determine to what extent these are due to

changes in the underlying population or to changes in the policies.

The second purpose of running EUROMOD on HFCS data is to derive a proper measure of disposable

income, as the HFCS contains only gross income amounts which are not suitable for redistributive

analysis. For the first time, this allows us to consider the joint distribution of disposable income,

wealth and consumption based on information coming from the same survey, potentially

comparable across countries and time.

3 HFCS and its perceived advantages over EU-SILC

The Eurosystem Household Finance and Consumption Survey (HFCS) is a new dataset covering

detailed household wealth, gross income and consumption information (Eurosystem Household

Finance and Consumption Network [HFCN from now onwards], 2013a). It is the result of a joint effort

of all National Banks of the Euro zone, three National Statistical Institutes¹ and the European Central

Bank (ECB). The first wave was made available to researchers in April 2013 and contains information

on more than 62 000 households in 15 Euro area member states which were surveyed mostly in 2010

and 2011². Ireland and Estonia are not included, but joined in the second wave (fieldwork period is

2014). Moreover, Latvia, which joined the Euro zone on 1 January 2014, has also carried out

the survey for the second wave.

An important shortcoming of the direct research use of the HFCS data is that it only covers gross

income amounts which make them for instance unsuitable for the analysis of issues of inequality and

redistribution. Nevertheless, the income components that are covered in the HFCS are largely the

same as those surveyed in EU-SILC. More specifically, the HFCS gross income concept includes the

following components: employee income, self-employment income, rental income from real estate

1 Of France, Finland and Portugal

2 Exceptions are France (2009/2010), Greece (2009) and Spain (2008/2009)

property, income from financial investments, income from pensions (public, occupational & private),

regular social transfers, regular private transfers, income from private business and income from

other sources (HFCN, 2013b, p.108). The major differences with the income concept of EU-SILC are

presented in Table 1. First, it is clear that in the category of employee income the HFCS only asks

respondents about cash and near cash income, while EU-SILC also captures non-cash income.

Secondly, pensions from mandatory employer-based schemes are included in public pensions in EU-

SILC, while they are covered under private pensions in the HFCS (HFCN, 2013a, p.100). Finally,

income received by people under 16 is covered in EU-SILC, but not in the HFCS. In contrast, the HFCS

covers income from other types of sources (such as capital gains or losses from the sale of assets,

prize winnings, insurance settlements, severance payments, lump sum payments upon retirement),

while EU-SILC does not. However, considering the joint patterns of income and wealth inequality in

Belgium, Kuypers et al. (2015) show that despite these methodological similarities, a non-negligible

difference in gross income distributional outcomes exists between the HFCS and EU-SILC, mainly at

the top of the distribution, which is arguably the consequence of the oversampling strategy

implemented in the HFCS (see below for more details). Such differences suggest that it is not enough to

look at median incomes (HFCN, 2013, p. 100) to provide a reliable comparison between different

surveys.

Table 1: Comparison of gross income components HFCS and EU-SILC

HFCS | EU-SILC

Employee income (cash & near-cash income) | Employee cash or near-cash income

- - - | Non-cash employee income

Self-employment income | Cash benefits or losses from self-employment

Rental income from real estate property | Income from rental of a property or land

Income from financial investments; income from private business other than self-employment | Interest, dividends, profit from capital investments in unincorporated business

Public pensions (old-age pension, survivor pension, disability pension) | Old-age benefits; survivor benefits; disability benefits

Occupational & private pensions | Pensions from individual private plans

Unemployment benefits | Unemployment benefits

Other social transfers (family/children related allowances, housing allowances, education allowances, minimum subsistence, other social benefits) | Family/children related allowances; housing allowances; education-related allowances; sickness benefits; social exclusion not elsewhere classified

Regular private transfers | Regular inter-household cash transfer

- - - | Income received by people aged under 16

Income from other sources | - - -

Source: HFCN (2013) & European Commission

The HFCS dataset contains some very interesting features. First, the very wealthy are oversampled

such that a better coverage of the top of the income and wealth distributions is obtained. This is

necessary because there exist large sampling and non-sampling errors as a consequence of the large

skewness of the wealth distribution. In particular the wealthiest households are less likely to respond

and more likely to underreport, especially in the case of financial assets (Davies et al., 2011).

Moreover, it also makes the rather small sample more representative. Hence, in contrast to EU-SILC

which should represent the entire income distribution and is used to identify poor households, the

HFCS focusses on the top of the distribution (HFCN, 2013a, p.98-99). Since taxes typically have a

larger impact on the top of the distribution the implementation of the HFCS in EUROMOD should

lead to more accurate outcomes on the distributional and budgetary effects of taxation. The HFCN

(2013b, p.21) indicates that this oversampling strategy in some countries comes at the expense of

coverage at the bottom of the distribution, but it is not clear to what extent this is the case in

practice. As a consequence, the benefit side of the redistributive system may still be better covered

by EU-SILC.

A second interesting feature of the HFCS data is that a multiple imputation technique was used to

deal with selective item non-response (in the form of five different imputations). In other words,

crucial income and wealth information does not need to be imputed by researchers in the process of

building the database. Such imputation is not performed as standard in EU-SILC, which means that the

researcher has to decide how to treat missing values. Moreover, five different imputations will generally lead to more

accurate outcomes than a single imputation. The number of covariates used for the imputation,

however, largely differs between countries as well as by income or asset type³. Moreover, the

concrete variables that are used for these imputations are not documented. Therefore, the quality of

imputations for individual countries may be hard to evaluate (Tiefensee & Grabka, 2014).

The largest added value from using the HFCS data as an underlying database for EUROMOD is that it

covers much more detailed information on wealth issues. This will allow the expansion of policy

domains currently covered in EUROMOD with taxation of wealth and income from wealth. In the trial

database created for this paper, however, this was not yet implemented. We only constructed the

database in the same manner as the one based on EU-SILC and as it is currently needed for its

inclusion into EUROMOD to get a distribution of disposable income and to measure the redistributive

effect of the tax-benefit system.

In order to evaluate the suitability of the HFCS as a EUROMOD database we construct a trial database

and validate the main outcomes of running EUROMOD on the HFCS, comparing them with those

obtained using EU-SILC as input database. The HFCS data potentially supplies micro data on 15 euro

area member states. However, the quality and reliability of the HFCS data is not clear yet for all

countries. For Belgium an extensive validation of the HFCS data against external data sources such as

EU-SILC and SHARE indicates that the HFCS is sufficiently reliable for the study of income and wealth

in Belgium (Kuypers et al., 2015). Practical issues in the creation of this database are discussed in the

following part.

3 For example, the imputation of missing values of employee income is based on 224 covariates in Spain, while the

Netherlands use only 5 variables (HFCN, 2013a, p.51).

4 The data requirements for EUROMOD and the HFCS: a case study

for Belgium

Figari et al. (2007) list a set of basic data requirements that a database must fulfil in order to be

incorporated in a sensible way in EUROMOD. These are:

- The database used must be a recent, representative sample of households, large enough to

support the analysis of small groups and with weights to apply to population level and

correct for non-response;

- The database must contain information on primary gross incomes by source and at the

individual level, with the reference period being relevant to the assessment periods for taxes

and benefits. When benefits cannot be simulated, information on the amount of these

benefits, gross of taxes, is required for each recipient;

- The database must contain information about individual characteristics and within-

household family relationships;

- It must contain information on housing costs and other expenditures that may affect tax

liabilities or benefit entitlements;

- Specific other information on characteristics affecting tax liabilities or benefit entitlements

(examples include weekly hours of work, disability status, civil servant status, private pension

contributions) is also necessary;

- The same reference period(s) should apply to personal characteristics (e.g. employment

status) and income information (e.g. earnings) corresponding to it. In principle this implies

the recording of status variables for each period within the year;

- There should be no missing information from individual records or for individuals within

households. Where imputations have been necessary, detailed information about how they

were done is necessary.

In general, most of these requirements are met for the HFCS data (see also previous section). We

now provide more details on how the HFCS scores on these requirements for Belgium, which is

presented here as a pilot exercise. We make use of the UDB 1.1 data version of the HFCS (February

2015 release) on which we construct a new dataset containing the mean estimate over the five

imputations for each case where such a multiple imputation was done. We highlight issues of sample

size, reference period, imputation of missing information, the disaggregation of certain variables into

more detailed information, etc.
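
The averaging over the five imputations described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the identifier `hh_id` and the variable name `gross_employee_income` are hypothetical and do not correspond to actual HFCS variable codes. Note also that averaging yields point estimates only; it does not carry over the between-imputation variance that a full multiple-imputation analysis would use.

```python
from statistics import mean


def average_implicates(implicates, key, value_vars):
    """Collapse multiply imputed datasets into one record per unit.

    implicates : list of datasets, each a list of dicts (one dict per unit)
    key        : name of the unit identifier (hypothetical, e.g. "hh_id")
    value_vars : names of the variables to average over the implicates
    """
    grouped = {}
    for dataset in implicates:
        for row in dataset:
            grouped.setdefault(row[key], []).append(row)

    averaged = []
    for unit_id, rows in grouped.items():
        record = {key: unit_id}
        for var in value_vars:
            # Non-imputed values are identical across implicates,
            # so their mean equals the reported value.
            record[var] = mean(r[var] for r in rows)
        averaged.append(record)
    return averaged


# Illustrative data: household 1 was imputed, household 2 was not.
implicates = [
    [{"hh_id": 1, "gross_employee_income": v},
     {"hh_id": 2, "gross_employee_income": 30000}]
    for v in (100, 110, 120, 130, 140)
]
averaged = average_implicates(implicates, "hh_id", ["gross_employee_income"])
```

Each unit ends up with a single value per variable, which is the form a single EUROMOD input file requires.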

Sample

The UDB data for Belgium include information on 5,506 individuals living in 2,327 households. They

were surveyed between April and October 2010, so that the reference income period is 2009. The

oversampling of the wealthy was implemented in Belgium based on the NUTS 1 region and the

average income by neighbourhood of residence, which results in an effective oversampling rate of

the top 10% equal to 47 per cent (HFCN, 2013c). As mentioned before, missing information on crucial

variables is multiply imputed, so that in principle the full sample can be used for the construction of the

EUROMOD input database. However, following common EUROMOD conventions, in the creation of

the EUROMOD input database children that were born after the end of the income reference period

are deleted from the sample. In the HFCS we only know the age of the individual at the time of the

interview, not the year in which they were born. We assume all individuals aged 0 years to be born

after the income reference period. Since most Belgian interviews were done in the second half of

2010 this assumption is relatively acceptable. In the case of Belgium this concerns 18 children who are still

in their first year of life. Hence, the final sample covers 5,488 individuals. Some descriptive statistics

used for the grossing up to the level of the full population (10.8 million people) are presented in

Table 2. A comparison with those for EU-SILC immediately shows that the HFCS sample is much

smaller and therefore its statistical reliability may be lower.

Table 2: Descriptive statistics of sample and weights

Observations Mean weight Median weight Min weight Max weight

HFCS 5,488 1,961.1 1,274.8 149.7 12,205.7

EU-SILC 14,700 727.1 651.9 97.2 4,523.1

Source: own calculations based on HFCS
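
As a rough consistency check on Table 2, the number of observations multiplied by the mean weight should reproduce the grossed-up population. The sketch below also shows the sample restriction described above (dropping individuals aged 0). The function names and the `age` field are illustrative assumptions, not part of EUROMOD or the HFCS.

```python
def drop_born_after_reference_period(persons):
    """Keep only individuals aged 1 or more at the interview.

    Assumption taken from the text: everyone aged 0 is treated as born
    after the income reference period. `age` is a hypothetical field name.
    """
    return [p for p in persons if p["age"] > 0]


def implied_population(observations, mean_weight):
    """Sum of weights implied by the sample size and the mean weight."""
    return observations * mean_weight


# Figures from Table 2.
hfcs_population = implied_population(5488, 1961.1)
silc_population = implied_population(14700, 727.1)

# Tiny illustrative sample: one infant is removed.
sample = [{"age": 0}, {"age": 34}, {"age": 67}]
restricted = drop_born_after_reference_period(sample)
```

With the Table 2 figures, the HFCS weights sum to about 10.8 million and the EU-SILC weights to about 10.7 million, which is in line with the population size mentioned in the text.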

Reference period

The HFCS questionnaire asks individuals to declare incomes received in 2009, but all aspects relating

to assets and debt holdings as well as demographic and economic characteristics refer to the time of

the interview. We have to make the assumption that these aspects have not changed compared to

the income reference period. For example, since we only know the age of individuals at the time of

interview and not the birth year we cannot just subtract 1 year for age because we do not know

whether the person has already celebrated their birthday when the interview takes place. Moreover,

we do not know whether an individual has perhaps experienced a change in their labour market status,

marital status, etc. We deem it reasonable to assume that the largest share of individuals has not

experienced a change in their main demographic and economic characteristics, or that such a change

has no large impact on the outcomes. In sum, the practice is basically the same as the one used

when deriving an EU-SILC based EUROMOD input database.

Adjustments of variables

With the exception of certain variables, EUROMOD input variables on labour market information,

incomes, benefits, etc. need to be covered at the individual level. As in EU-SILC a number of these

components are surveyed at the household level in the HFCS. In order to divide these between

individuals we followed the same process that was developed for the EU-SILC based input database.

The components for which this applies are:

- Rental income from real estate property

- Income from financial investments

- Income from regular social transfers

- Income from other sources

Important to note is that the EUROMOD variable ‘INCOME: other’ (yot) in the EU-SILC refers to

income received by individuals younger than 16 years, while it refers to income received from other

sources in the HFCS.

Variables that could not be created using the HFCS as an underlying dataset and which are used in

EUROMOD - Belgium are firm size (lfs) for the calculation of employer contributions and Belgian

cadastral income of the own residence (khooo) for the calculation of personal income taxes.

Disaggregation of social transfers

In the original HFCS dataset all incomes from regular social transfers (except pensions and

unemployment benefits) are covered under one aggregated variable (HG0110), while EUROMOD

requires all types of benefits to be covered separately. As this variable includes income sources

received both at the individual (e.g. educational allowances) and the household level (e.g. housing

allowances, family benefits,…) that are not mutually exclusive, its breakdown into separate

components is not straightforward. Child benefits and social assistance for Belgium can be accurately

simulated in EUROMOD and are considered to be the most important, and also the most widespread

among households in the case of child benefits. Therefore, we opted to create a EUROMOD variable

‘BENEFIT: Other’ (bot) which was set equal to the aggregate reported variable and to simulate the

child benefit and social assistance in EUROMOD⁴, after which these two values are subtracted from

the aggregate variable. As a result of this process we have three output variables: one containing the

simulated child benefits, one containing the simulated social assistance benefits and one including all

other types of benefits covered in HG0110. Where the simulated benefits turn out to be larger than

the reported amounts, we decided first to use the simulated benefits and assume no other benefits

when the difference between simulated and declared benefits is smaller than 150 euros per month.

Second, for those households where no social benefits are declared and there are indications of a

reconstituted family, child benefits are set to zero. This is because in Belgium mothers typically

receive child benefits so that if the father and child are part of a new household in the dataset no

child benefits are received by them. Finally, for the remainder of households the difference between

declared and simulated social benefits is relatively large, with a large share of households with

dependent children not declaring any social benefits (about 85% of the remaining households)⁵.

Therefore, we assume for all remaining households that they have ‘forgotten’ to declare child

benefits. We simply make use of the simulated child benefits, while the declared benefit amount, if

any, is considered to refer to other types of social benefits.
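
The decision rules above can be summarised in a short sketch. It is a simplified reading of the text, not the actual implementation: amounts are assumed to be annual (so the 150 euro monthly tolerance is multiplied by 12), the reconstituted-family indicator is assumed to be available as a flag, and simulated social assistance is assumed to be kept unchanged in every case, which the text does not state explicitly. All names are hypothetical.

```python
MONTHLY_TOLERANCE_EUR = 150.0  # threshold mentioned in the text


def split_social_transfers(reported, sim_child_benefit, sim_social_assistance,
                           reconstituted_family=False):
    """Split the aggregate reported transfer into three components.

    reported              : declared aggregate (HG0110), assumed annual
    sim_child_benefit     : child benefit simulated by the model, annual
    sim_social_assistance : social assistance simulated by the model, annual
    reconstituted_family  : hypothetical flag for a reconstituted family
    """
    simulated = sim_child_benefit + sim_social_assistance

    if simulated <= reported:
        # Regular case: the residual is attributed to other benefits.
        return {"child_benefit": sim_child_benefit,
                "social_assistance": sim_social_assistance,
                "other": reported - simulated}

    gap = simulated - reported
    if gap < 12 * MONTHLY_TOLERANCE_EUR:
        # Small shortfall: trust the simulation, assume no other benefits.
        return {"child_benefit": sim_child_benefit,
                "social_assistance": sim_social_assistance,
                "other": 0.0}

    if reported == 0 and reconstituted_family:
        # Child benefit is assumed to be paid to the other household.
        return {"child_benefit": 0.0,
                "social_assistance": sim_social_assistance,
                "other": 0.0}

    # Remaining households: child benefit is assumed to have been
    # forgotten, so the declared amount counts as other benefits.
    return {"child_benefit": sim_child_benefit,
            "social_assistance": sim_social_assistance,
            "other": reported}
```

The order of the checks follows the order in which the rules are presented in the text; a different ordering would classify some borderline households differently.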

In case of income received from public pensions all types of benefits are also included in one

aggregated variable (PG0310). In order to obtain separate EUROMOD input variables for old age

benefits, survivor benefits and disability benefits the aggregated variable was imputed. As we

assume these three types of benefits to be mutually exclusive, we imputed old age benefits as all

pension income received from the age of 65 onwards and survivor benefits as those pensions

received under the age of 65 by widowed persons. Finally, disability benefits are all those pensions

received by someone who is ‘permanently disabled’ according to one of their declared labour statuses.
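
A minimal sketch of this classification is given below. The order in which the conditions are checked (age first, then widowhood, then disability) is an assumption that follows the order of the text, and the category labels, argument names and the fallback label `unclassified` are hypothetical; the text does not say how pensions that match none of the three conditions are treated.

```python
def classify_public_pension(age, marital_status, labour_statuses):
    """Assign reported public pension income (PG0310) to one benefit type.

    age             : age at the time of the interview
    marital_status  : e.g. "widowed", "married" (hypothetical coding)
    labour_statuses : collection of declared labour statuses
    """
    if age >= 65:
        return "old_age"
    if marital_status == "widowed":
        return "survivor"
    if "permanently disabled" in labour_statuses:
        return "disability"
    # Not covered by the rules in the text (assumption).
    return "unclassified"


examples = [
    classify_public_pension(70, "widowed", ["retired"]),
    classify_public_pension(58, "widowed", ["employee"]),
    classify_public_pension(50, "married", ["permanently disabled"]),
]
```

Because the categories are treated as mutually exclusive, each recipient's full pension amount is assigned to exactly one of them.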

Imputation of main residence mortgages

The HFCS dataset covers very detailed information on mortgages held for the main residence, among

others the monthly payment that is made. However, EUROMOD requires a specification of the part

that is paid in interest and the capital part. First, it should be noted that wealth information in the

HFCS refers to the time of the survey, hence 2010. We assume that mortgage payments are the same

in 2009 as in 2010. Mortgages that were taken out or refinanced in 2010 are not included. For

households that refinanced their loan in 2010 we thus disregard the fact that they did have

a mortgage in 2009, because we do not know the specifics, such as the old interest rate, of this

4 Similarly to the EU-SILC based simulations, the amounts of social assistance are adjusted for non-take-up of benefits with a random non take-up correction.

5 Overall, approximately 46.7% of households with dependent children do not report any social benefits.

loan. This only involves 8 households. Second, we assume that all households made payments

during all 12 months. This could be a problem if the mortgage was taken out or expired in the income

reference period. Furthermore, we used the following formula to split the mortgage repayment into

an interest and a capital part:

$$\text{interest part} = \text{repayment} \times \left[1 - (1 + i)^{k-n-1}\right]$$

where $i$ refers to the interest rate, $n$ to the duration of the mortgage and $k$ to the time of the

mortgage period that already passed. Subtracting this interest part from the repayment amount

gives the capital part. We have detailed information on duration and interest rate for the first two

mortgages only. For the third mortgage onwards we only have information on the monthly

repayment. We opt to apply the parameters of the second mortgage to the payments for the other

mortgages. In the Belgian sample 16 households have a third mortgage for their main residence.
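The split of a mortgage repayment into an interest and a capital part can be illustrated with a short sketch of the formula given above. The function name is hypothetical, and two assumptions are made that the text does not spell out: the interest rate i and the periods n and k are expressed in the same unit as the repayment (for example months), and k is read as the index of the current payment, starting at 1.

```python
from typing import Tuple


def split_repayment(repayment: float, i: float, n: int, k: int) -> Tuple[float, float]:
    """Split a constant repayment into (interest part, capital part).

    repayment: constant payment per period
    i:         interest rate per period (assumed to match the payment frequency)
    n:         total duration of the mortgage in periods
    k:         index of the current payment, 1 <= k <= n (assumed interpretation)
    """
    interest_part = repayment * (1.0 - (1.0 + i) ** (k - n - 1))
    capital_part = repayment - interest_part
    return interest_part, capital_part


# Illustrative values only: 1,000 per month, 0.5% monthly interest, 20 years.
first_interest, first_capital = split_repayment(1000.0, 0.005, 240, 1)
last_interest, last_capital = split_repayment(1000.0, 0.005, 240, 240)
```

As expected for a loan with constant repayments, the interest part is largest for the first payment and shrinks towards the end of the loan, and the capital parts over all n payments add up to the amount originally borrowed.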

Missing regional information

Unfortunately the HFCS UDB data do not include information on the region households live in. In the

construction of the trial database we arbitrarily assumed all households to live in Flanders, as this

region has the largest population share, so that this assumption will have the smallest impact on the EUROMOD

outcomes. However, the EUROMOD module simulating the Flemish contribution to care insurance

was not switched on, because all households would then be liable to pay this specific contribution6.

For other policies the effect on outcomes is probably smaller, as they apply in all regions and differ only

in the level of their rates. The lack of NUTS1 information, however, will turn out to be a large problem in

the future, because the sixth state reform involves a substantial transfer of tax-benefit competences

from the federal to the regional level.

5 Simulating net incomes using the Belgian HFCS data

In this part we discuss the outcomes from the EUROMOD process based on the Belgian HFCS data.

Moreover, in order to validate these results we compare them to those obtained with the EU-SILC

database and, where possible, to other available sources.

We start by analysing how well the HFCS and EU-SILC data represent the Belgian population. Table 3

provides an overview of some basic socio-demographic indicators. Overall, the HFCS and EU-SILC

data appear to represent the Belgian population in a similar way. While the age and gender

distribution are highly similar, the results for highest education achieved and tenure status are found

to diverge slightly more. Relating to the latter, both surveys cover about the same home-ownership

rate, but they differ somewhat in their subdivision on mortgage holdings. Comparing the outcomes

for both surveys with those of external sources, however, indicates that neither sample is

completely representative of the Belgian population. While the HFCS more closely represents the

distribution of highest education achieved than EU-SILC, it appears to underestimate the share of

self-employment. However, in the HFCS respondents can declare several labour statuses. In our

6 As this contribution amounts to a maximum of 50 euros per year, the overall effect of this omission is probably negligible.

definition of labour market status we used only the one that was reported first. It is possible that an

individual declares self-employment as the second, third, etc. labour status at the time of interview

(2010), but has worked in self-employment throughout the main part of the income reference period

(2009) (see discussion of reference period above).

Table 3: Comparison of socio-demographic characteristics between HFCS, EU-SILC and EUROSTAT

| Characteristic | HFCS 2009 | EU-SILC 2009 | External 2009 |
|---|---|---|---|
| Age | | | |
| 0-15 | 17.7 | 18.2 | 18.1 |
| 16-29 | 17.5 | 17.5 | 17.4 |
| 30-44 | 21.3 | 21.1 | 20.9 |
| 45-64 | 26.4 | 27.0 | 27.5 |
| 65+ | 17.2 | 16.2 | 16.1 |
| Gender | | | |
| Female | 51.0 | 50.8 | 51.0 |
| Male | 49.0 | 49.2 | 49.0 |
| Highest education achieved (*) | | | |
| Not completed primary | 12.8 | 18.2 | 19.2 (*) |
| Primary | 11.5 | 12.8 | |
| Lower secondary | 16.0 | 18.0 | 20.2 |
| Upper secondary | 30.9 | 25.1 | 34.6 |
| Post-secondary | N/A | 1.8 | N/A |
| Tertiary | 28.8 | 24.1 | 26.1 |
| Labour market status | | | |
| Pre-school | 5.9 | 7.3 | (**) |
| Employer or self-employed | 3.6 | 4.1 | 6.0 |
| Employee | 36.4 | 35.7 | 33.1 |
| Pensioner | 21.0 | 18.6 | 17.7 |
| Unemployed | 6.5 | 5.1 | 4.9 |
| Student | 19.8 | 18.1 | 22.9 (**) |
| Inactive | 0.0 | 1.4 | 1.0 |
| Sick or disabled | 2.4 | 3.0 | 2.2 |
| Other | 4.2 | 6.5 | 11.6 |
| Family worker | 0.1 | 0.2 | 0.6 |
| Marital status | | | |
| Single | 46.6 | 44.6 | 43.6 |
| Married | 40.8 | 40.5 | 41.7 |
| Separated | N/A | 0.3 | N/A |
| Divorced | 6.6 | 8.8 | 8.1 |
| Widowed | 6.0 | 5.8 | 6.6 |
| Tenure status | | | |
| Owned on mortgage | 37.4 | 41.6 | 69.0 (***) (all owned) |
| Owned outright | 36.2 | 30.2 | |
| Rented | 24.7 | 19.5 | 28.8 (***) (all rented) |
| Reduced rented | N/A | 7.2 | |
| Social rented | N/A | N/A | |
| Free user | 1.7 | 1.4 | |

Notes: outcomes are estimated for the EUROMOD sample, not the survey sample; (*) external data only on persons aged 15 years and over, first figure refers to joint category of not completed primary and primary education; (**) includes all children that are entitled to child benefits; (***) figures for 2011, only breakdown into owned versus rented house is available, rest category (2.2%) refers among others to collective residential accommodations which are typically not part of a survey sample

Source: own calculations and external sources: age, gender and marital status: EUROSTAT based on CENSUS, education attained: FOD Economics, Department Statistics, labour market status: Data warehouse labour market and social protection of the Crossroads Bank for Social Security, tenure status: CENSUS 2011

Now we move on to the analysis of the distributional outcomes. All figures in this part are computed

for individuals based on their household disposable income equivalised by the OECD modified scale7

and expressed in annual terms, unless mentioned otherwise.
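To make the equivalisation step concrete, the following sketch computes the modified OECD scale (weight 1 for the first adult, 0.5 for each additional person aged 14 or over, 0.3 for each person younger than 14) and the resulting equivalised income. The function names are hypothetical, and the sketch assumes that every household contains at least one person aged 14 or over.

```python
from typing import Sequence


def oecd_modified_scale(ages: Sequence[int]) -> float:
    """Equivalence scale for a household described by the ages of its members."""
    n_adults = sum(1 for age in ages if age >= 14)
    n_children = len(ages) - n_adults
    if n_adults == 0:
        # Assumption of this sketch: households without anyone aged 14+ are not handled.
        raise ValueError("household needs at least one member aged 14 or over")
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


def equivalised_income(household_income: float, ages: Sequence[int]) -> float:
    """Household disposable income divided by the equivalence scale.

    The same equivalised amount is then assigned to every household member.
    """
    return household_income / oecd_modified_scale(ages)
```

A couple with two children aged 10 and 5, for instance, has a scale of 1 + 0.5 + 0.3 + 0.3 = 2.1, so that an annual household income of 42,000 euros corresponds to an equivalised income of 20,000 euros per person.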

Table 4 shows the outcomes of EUROMOD based on the HFCS in terms of income inequality. We

show outcomes for disposable as well as original income (including pensions), the difference

between them being the inclusion of taxes, social insurance contributions and benefits. The HFCS

median is found to be very similar to the EU-SILC median (see also HFCN (2013, p. 100) for

cross-country evidence), while mean estimates appear to be somewhat higher based on the HFCS,

especially for original income. The inequality indices, however, require more

attention and further investigation, as they show a rather large discrepancy between the outcomes

of the HFCS and EU-SILC. As we will discuss below, this is likely the consequence of the

oversampling strategy applied in the HFCS. The Kakwani measure for progressivity is shown in Table

5. As expected given the oversampling strategy, all components of the tax-benefit system are found

to be more progressive in the HFCS than in the EU-SILC database.

Table 4: Comparison of income inequality indicators between HFCS and EU-SILC

| Indicator | EUROMOD 2009 based on HFCS | EUROMOD 2009 based on EU-SILC | EU-SILC incomes 2009 |
|---|---|---|---|
| Disposable income | | | |
| Mean | 21,995 | 20,036 | 21,201 |
| Median | 18,977 | 18,919 | 19,469 |
| Gini coefficient | 0.3172 | 0.2259 | 0.2602 |
| Income quintile ratio (S80/S20) | 5.00 | 3.21 | 3.82 |
| Original income | | | |
| Mean | 29,435 | 25,247 | 25,635 |
| Median | 22,263 | 22,638 | 22,917 |
| Gini coefficient | 0.4655 | 0.3771 | 0.3792 |
| Income quintile ratio (S80/S20) | 17.05 | 11.08 | 11.29 |

Source: own calculations
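For readers who want to reproduce indicators of this kind, the sketch below shows one common way to compute a weighted Gini coefficient and the S80/S20 ratio from a Lorenz curve. It is an illustration only and not necessarily the routine behind the figures reported here: the function names are hypothetical, incomes are assumed to be equivalised and non-negative with a positive total, weights are survey weights, and quintile shares are obtained by linear interpolation of the Lorenz curve.

```python
from typing import List, Sequence, Tuple


def lorenz_points(incomes: Sequence[float],
                  weights: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Cumulative population shares and income shares, sorted by income."""
    pairs = sorted(zip(incomes, weights))
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    pop, inc = [0.0], [0.0]
    cum_w = cum_y = 0.0
    for y, w in pairs:
        cum_w += w
        cum_y += y * w
        pop.append(cum_w / total_w)
        inc.append(cum_y / total_y)
    return pop, inc


def gini(incomes: Sequence[float], weights: Sequence[float]) -> float:
    """Gini coefficient as one minus twice the area under the Lorenz curve."""
    pop, inc = lorenz_points(incomes, weights)
    area = sum((pop[k] - pop[k - 1]) * (inc[k] + inc[k - 1])
               for k in range(1, len(pop)))
    return 1.0 - area


def lorenz_at(q: float, pop: List[float], inc: List[float]) -> float:
    """Income share of the poorest fraction q (linear interpolation)."""
    for k in range(1, len(pop)):
        if q <= pop[k]:
            span = pop[k] - pop[k - 1]
            frac = (q - pop[k - 1]) / span if span > 0 else 0.0
            return inc[k - 1] + frac * (inc[k] - inc[k - 1])
    return 1.0


def s80_s20(incomes: Sequence[float], weights: Sequence[float]) -> float:
    """Income share of the top 20% divided by that of the bottom 20%."""
    pop, inc = lorenz_points(incomes, weights)
    return (1.0 - lorenz_at(0.8, pop, inc)) / lorenz_at(0.2, pop, inc)
```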

Table 5: Comparison of progressivity HFCS and EU-SILC

| Kakwani index | EUROMOD 2009 based on HFCS | EUROMOD 2009 based on EU-SILC |
|---|---|---|
| Taxes | 0.3737 | 0.2811 |
| Social insurance contributions | 0.2557 | 0.2176 |
| Social benefits | 0.3140 | 0.2880 |

Source: own calculations
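As a reminder of what the Kakwani index measures, the sketch below computes it for a tax as the concentration coefficient of the tax (with individuals ranked by pre-tax income) minus the Gini coefficient of pre-tax income. This is a generic illustration with hypothetical function names; the ranking variable and the sign convention used for benefits in the table above are not restated here, so the sketch covers the tax case only and assumes positive totals for both variables.

```python
from typing import Sequence


def concentration(values: Sequence[float],
                  rank_by: Sequence[float],
                  weights: Sequence[float]) -> float:
    """Concentration coefficient of `values` with units ranked by `rank_by`.

    When `values` and `rank_by` are the same variable this is the Gini coefficient.
    """
    rows = sorted(zip(rank_by, values, weights), key=lambda row: row[0])
    total_w = sum(w for _, _, w in rows)
    total_v = sum(v * w for _, v, w in rows)
    cum_w = cum_v = 0.0
    prev_p = prev_c = 0.0
    area = 0.0
    for _, v, w in rows:
        cum_w += w
        cum_v += v * w
        p, c = cum_w / total_w, cum_v / total_v
        area += (p - prev_p) * (c + prev_c)  # trapezoid under the curve
        prev_p, prev_c = p, c
    return 1.0 - area


def kakwani_tax(tax: Sequence[float],
                pre_tax_income: Sequence[float],
                weights: Sequence[float]) -> float:
    """Kakwani index: tax concentration minus pre-tax income Gini."""
    return (concentration(tax, pre_tax_income, weights)
            - concentration(pre_tax_income, pre_tax_income, weights))
```

A proportional tax yields an index of zero, while a tax that falls more than proportionally on higher incomes yields a positive value.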

7 The modified OECD equivalence scale is constructed by giving the first adult a weight of 1 and any additional individuals aged 14 years or

over a weight of 0.5, while individuals younger than 14 count for 0.3.

In Table 6 we show how the impact of the tax-benefit system is distributed across deciles. We find

that the difference in inequality is mainly driven by divergence at the top and the bottom of the

income distribution. While the average equivalised disposable income in the 10th decile is equal to

€58,545 based on the HFCS, it is only €37,580 for EU-SILC. Conversely, average disposable

income in the bottom decile is approximately 33% higher in EU-SILC than in the HFCS (€8,235 versus €6,177). Moreover,

differences are mainly found with regard to taxes and social insurance contributions, which are

typically based on the income level, while outcomes for the benefits received are much

more similar, as eligibility is often based on non-monetary criteria, such as the presence of children

in the case of child benefits. This again indicates that the difference in outcomes

between the two surveys can mainly be attributed to the HFCS oversampling strategy.

Table 6: Comparison between HFCS and EU-SILC of averages of different components by decile of equivalised disposable income

EUROMOD 2009 based on HFCS

| Decile | Disposable income | Original income | Benefits | Taxes | Social insurance contributions |
|---|---|---|---|---|---|
| 1 | 6,177 | 2,657 | 3,700 | -26 | 206 |
| 2 | 11,215 | 8,466 | 3,588 | 201 | 637 |
| 3 | 13,725 | 12,884 | 2,567 | 751 | 976 |
| 4 | 15,869 | 16,683 | 2,737 | 1,863 | 1,689 |
| 5 | 18,021 | 20,463 | 2,516 | 2,933 | 2,025 |
| 6 | 19,958 | 23,844 | 2,817 | 4,062 | 2,641 |
| 7 | 22,497 | 30,071 | 2,063 | 6,138 | 3,499 |
| 8 | 25,110 | 35,073 | 2,090 | 7,859 | 4,195 |
| 9 | 29,043 | 44,248 | 1,315 | 11,244 | 5,276 |
| 10 | 58,545 | 100,382 | 2,183 | 34,731 | 9,289 |
| Total | 21,995 | 29,435 | 2,559 | 6,961 | 3,038 |

EUROMOD 2009 based on EU-SILC

| Decile | Disposable income | Original income | Benefits | Taxes | Social insurance contributions |
|---|---|---|---|---|---|
| 1 | 8,235 | 4,191 | 4,452 | 58 | 350 |
| 2 | 12,089 | 9,304 | 3,952 | 420 | 747 |
| 3 | 14,210 | 13,134 | 3,287 | 1,149 | 1,062 |
| 4 | 16,181 | 16,704 | 3,225 | 2,057 | 1,691 |
| 5 | 18,090 | 20,184 | 3,117 | 3,054 | 2,156 |
| 6 | 19,887 | 24,459 | 2,662 | 4,368 | 2,866 |
| 7 | 21,960 | 28,485 | 2,418 | 5,601 | 3,342 |
| 8 | 24,408 | 33,583 | 2,446 | 7,493 | 4,128 |
| 9 | 27,744 | 40,812 | 2,084 | 10,217 | 4,934 |
| 10 | 37,580 | 61,668 | 2,298 | 19,335 | 7,051 |
| Total | 20,036 | 25,247 | 2,994 | 5,373 | 2,832 |

Source: own calculations

Table 7 shows a comparison of poverty rates for several poverty thresholds. At each poverty line the

share of poor individuals is higher in the HFCS than in EU-SILC, although the gap decreases slightly

at higher poverty thresholds. However, the HFCS outcomes are closer to the poverty rates based on

reported EU-SILC disposable incomes. It is well known that disposable income simulated in

EUROMOD differs considerably from the reported disposable incomes in the UDB EU-SILC

data (Hufkens et al., 2014). At the official threshold of 60% of median equivalised disposable income

the poverty rate is equal to 15.3% for the HFCS EUROMOD database, 11.7% for the EU-SILC

EUROMOD database and 14.6% for the UDB EU-SILC. As was shown in Table 4 the HFCS and EU-SILC

medians are similar. Hence, the difference in poverty rates can hardly be attributed to a difference in

poverty thresholds. However, Table 6 showed that HFCS figures of net disposable incomes are lower

at the bottom of the distribution and higher at the top of the distribution compared to the EU-SILC

outcomes. This directly impacts the number of individuals below 60% of median income.
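The mechanics behind these poverty rates can be sketched as follows: the poverty line is a fraction of the (weighted) median equivalised disposable income, and the poverty rate is the weighted share of individuals with an income below that line. The function names are hypothetical and the simple weighted median used here is only one of several possible definitions, so this is an illustration rather than the exact routine behind the reported figures.

```python
from typing import Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Smallest value at which the cumulative weight reaches half of the total."""
    pairs = sorted(zip(values, weights))
    half = sum(w for _, w in pairs) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("no observations")


def poverty_rate(eq_incomes: Sequence[float],
                 weights: Sequence[float],
                 fraction: float = 0.6) -> float:
    """Weighted share of individuals below `fraction` times the median."""
    line = fraction * weighted_median(eq_incomes, weights)
    poor = sum(w for y, w in zip(eq_incomes, weights) if y < line)
    return poor / sum(weights)
```

Because the two datasets have almost the same median, the poverty line is nearly identical in both, and a higher poverty rate can only arise from more individuals lying below that common line.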

If we look at the distribution of poverty across age categories in Table 8, it appears that the

overrepresentation of poor individuals in the HFCS compared to EU-SILC is not evenly distributed by

age. While the share of poor individuals is relatively similar for the two surveys in the age

categories of 0-15 and 65+, the HFCS poverty rates in the other three age categories are much higher,

especially for individuals aged between 16 and 29.

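Table 7: Comparison of poverty rates at different poverty lines between HFCS and EU-SILC

| Percentage of individuals below: | EUROMOD 2009 based on HFCS | EUROMOD 2009 based on EU-SILC | EUROSTAT (EU-SILC) |
|---|---|---|---|
| 40% of median equivalised disposable income | 5.4% | 2.3% | 4.1% |
| 50% of median equivalised disposable income | 9.5% | 5.5% | 7.9% |
| 60% of median equivalised disposable income | 15.3% | 11.7% | 14.6% |
| 70% of median equivalised disposable income | 22.6% | 19.8% | 23.8% |

Source: Hufkens et al. (2014) and own calculations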

Table 8: Comparison of poverty rates by age group between HFCS and EU-SILC

| Age group | EUROMOD 2009 based on HFCS | EUROMOD 2009 based on EU-SILC | EU-SILC incomes 2009 |
|---|---|---|---|
| 0-15 | 14.8% | 14.8% | 18.5% |
| 16-29 | 20.8% | 11.1% | 14.3% |
| 30-44 | 15.7% | 9.4% | 11.4% |
| 45-64 | 13.9% | 9.5% | 11.6% |
| 65+ | 12.2% | 15.3% | 19.5% |

Note: poverty line is set at 60% of median equivalised income

Source: Hufkens et al. (2014) and own calculations

In short, this preliminary validation exercise of a EUROMOD database constructed on the HFCS

indicates that outcomes based on simulated disposable incomes are reasonable. This is in line with

Kuypers et al. (2015), who include a similar validation exercise for gross incomes comparing HFCS

Belgium with the EU-SILC and the SHARE database. There are, however, some remarkable differences

which warrant further investigation. The largest discrepancies are found with regard to the level of

inequality, which is found to be largely driven by divergences at the top of the distribution, which in

turn is assumed to be the consequence of the HFCS oversampling strategy. Kennickell (2008) and

Bover (2008) argue that, on top of correcting for nonresponse, oversampling of the wealthy also

provides more precise estimates of wealth in general and of narrowly held assets, as standard errors

are much smaller. Since the income and wealth distributions are highly correlated, especially at the

top (e.g. Alvaredo et al., 2013), oversampling will also result in more accurate estimates of the top of

the income distribution as well as of income sources that are typically received by a select group.

Therefore, we expect the HFCS to capture the actual level of inequality more closely than EU-SILC.

Vermeulen (2014), however, shows that despite the oversampling strategy wealth shares of the top 5

and 1% are still underestimated. It is not clear whether this is also the case for the income

distribution.

Some particular aspects should be borne in mind in the use of the HFCS-EUROMOD database and the

interpretation of its outcomes. First, the HFCS sample is considerably smaller than the EU-SILC

sample. Therefore one should be careful in interpreting results for small subgroups. Second, an

analysis of some socio-demographic characteristics indicated that the sample is not fully

representative of the Belgian population. Most importantly, the HFCS might slightly underestimate

the share of self-employment as main labour status. The largest limitation of the HFCS, however, is

that the income reference period and the reference time of other aspects do not coincide.

Moreover, the reference period also differs between countries, which will complicate

cross-country analyses.

6 Conclusion

This paper explores the feasibility of considering the HFCS data as an underlying database for the

European tax-benefit model EUROMOD. We created a trial database for Belgium and validated some

aggregate results by comparing outcomes to those obtained when EU-SILC is used as underlying

database as well as to external databases. These first results indicate that it is feasible to use the

HFCS database as EUROMOD input data, although some of the outcomes need further investigation.

The main differences exist with regard to the level of inequality, which is found to be largely driven

by divergences at the top of the distribution, which in turn is assumed to be the consequence of the

HFCS oversampling strategy. As our discussion above indicated, the oversampling of wealthy

households might result in more accurate estimates of income and wealth at the top. Another

conclusion from our research is that a comparison of results between EU-SILC and the HFCS cannot

be based on medians alone. It is important to look at the whole distribution, as our outcomes show that

there are some discrepancies, especially at the bottom and the top of the distribution. The reasons

for these discrepancies should be investigated in more depth.

Hence, our preliminary conclusion is that, although transforming the HFCS into a database for

EUROMOD would require a significant amount of effort and the simulation results require detailed

scrutiny to assess their reliability against external statistics and results based on different input data,

this is surely worthwhile because of the interesting possibilities to extend the policy scope of

EUROMOD and also to consider the joint distribution of disposable income, wealth and consumption.

In a future extension of this paper a second trial database for Italy will be constructed. Since the

HFCS data for Italy originate from the conversion of an existing national survey (i.e. Survey on

Household Income and Wealth (SHIW)) the strengths and weaknesses of these data are well known.

Moreover, many more variables are available for Italy, such as imputed rent and net incomes,

which will contribute considerably to the validation of using the HFCS as an underlying database

for tax-benefit microsimulation in EUROMOD.

7 References

Alvaredo, F., Atkinson, A. B., Piketty, T., & Saez, E. (2013). The top 1 percent in international and

historical perspective. Journal of Economic Perspectives, 27(3), 3-20.

Bover, O. (2008). Oversampling of the wealthy in the Spanish Survey of Household Finances (EFF).

Irving Fisher Committee Bulletin, 28, pp. 399-402.

Davies, J. B. (2009). Wealth and economic inequality. In W. Salverda, B. Nolan, & T. M. Smeeding, The

Oxford Handbook of economic inequality (pp. 127-149). Oxford: Oxford University Press.

Davies, J. B., Sandström, S., Shorrocks, A., & Wolff, E. N. (2011). The level and distribution of global

household wealth. The Economic Journal, 121(551), 223-254.

Eurosystem Household Finance and Consumption Network. (2013a). The Eurosystem Household

Finance and Consumption Survey - Methodological report for the first wave. ECB Statistics

Paper No1, 112p.

Eurosystem Household Finance and Consumption Network. (2013b). The Eurosystem Household

Finance and Consumption Survey - Results from the first wave. ECB Statistics Paper No2,

112p.

Figari, F. (2013). Should we make the richest pay to meet fiscal adjustment needs? - Discussion. The

role of tax policy in times of fiscal consolidation (pp. 103-107). European Economy, Economic

Papers 502.

Figari, F., Levy, H., & Sutherland, H. (2013). Using the EU-SILC for policy simulation: Prospects, some

limitations and some suggestions. Comparative EU Statistics on Income and Living Conditions:

Issues and challenges (pp. 345-373). Eurostat Methodologies and Working Papers, European

Communities.

Figari, F., Paulus, A., & Sutherland, H. (2015). Microsimulation and policy analysis. In A. B. Atkinson, &

F. Bourguignon, Handbook of Income Distribution Volume 2B. Amsterdam: Elsevier-North

Holland.

Hills, J. (2013). Safeguarding social equity during fiscal consolidation: which tax bases to use? The role

of tax policy in times of fiscal consolidation (pp. 80-91). European Economy, Economic Papers

502.

Hufkens, T., Spiritus, K., & Vanhille, J. (2014). EUROMOD Country Report Belgium 2009-2013.

Jäntti, M., Sierminska, E., & Van Kerm, P. (2013). The joint distribution of income and wealth. In J. C.

Gornick, & M. Jäntti, Income inequality. Economic disparities and the middle class in affluent

countries (pp. 312-333). Stanford: Stanford University Press.

Kennickell, A. B. (2008). The role of oversampling of the wealthy in the Survey of Consumer Finances.

Irving Fisher Committee Bulletin, 28, pp. 403-408.

Kuypers, S., Marx, I., & Verbist, G. (2015). Joint patterns of income and wealth inequality in Belgium.

Unpublished manuscript.

Piketty, T. (2011). On the long-run evolution of inheritance: France 1820-2050. The Quarterly Journal

of Economics, 126(3), 1071-1131.

Piketty, T. (2013). Should we make the richest pay to meet fiscal adjustment needs? The role of tax

policy in times of fiscal consolidation (pp. 99-102). European Economy, Economic Papers 502.

Piketty, T. (2014). Capital in the Twenty-First Century. Cambridge, MA: Harvard University Press.

Piketty, T., & Saez, E. (2013). Top incomes and the Great Recession: Recent evolutions and policy

implications. IMF Economic Review, 61, 456-478.

Piketty, T., & Zucman, G. (2014). Capital is back: Wealth-income ratios in rich countries, 1700-2010.

The Quarterly Journal of Economics, 129(3), 1255-1310.

Stiglitz, J. E., Sen, A., & Fitoussi, J.-P. (2011). Report by the Commission on the Measurement of

Economic Performance and Social Progress.

Sutherland, H., & Figari, F. (2013). EUROMOD: the European Union tax-benefit microsimulation

model. International Journal of Microsimulation, 6(1), 4-26.

Tiefensee, A., & Grabka, M. M. (2014). Comparing wealth - Data quality of the HFCS. DIW Berlin

Discussion Paper No 1427.

Vermeulen, P. (2014). How fat is the top tail of the wealth distribution? ECB Working Paper No1692.
