Paper for the Sixth Meeting of the Society for the Study of Economic Inequality (ECINEQ), July 13-15,
2015, Luxembourg
Using the Household Finance and Consumption Survey (HFCS) for a joint
assessment of income and wealth taxes: Prospects, limitations and
suggestions for policy simulations
<Draft paper, please do not quote>
Francesco Figari (1), Sarah Kuypers (2), Gerlinde Verbist (2)
(1) University of Insubria and ISER, University of Essex
(2) Centre for Social Policy, University of Antwerp
Abstract
We explore the prospects for using the Eurosystem Household Finance and Consumption Survey
(HFCS) dataset as an underlying micro-database for policy simulation across euro zone countries. In
particular, we consider the issues to be addressed and the advantages arising from building a
database from the HFCS for the EU tax-benefit model, EUROMOD. EUROMOD is currently running
mostly on EU-SILC data, but is built in a way that maximises its flexibility and its capacity to simulate
tax-benefit policies on different databases. Using the HFCS will make it possible to expand the policy domains currently
covered in EUROMOD with dimensions such as wealth taxation, which has recently gained much
prominence in the academic as well as the public debate. As the HFCS only contains gross income
amounts which are not suitable for redistributive analysis, the purpose of this paper is to derive net
incomes by simulating the gross-to-net transition with EUROMOD, taking into account all important
details of the social security and personal income tax system. In order to identify the issues and illustrate
their importance, a trial database for Belgium is constructed. We conclude that, although
transforming the HFCS into a database for EUROMOD would require a significant amount of effort,
this is surely worthwhile because of the interesting possibilities to extend the policy scope of
EUROMOD and to consider jointly the redistributive effect of income and wealth taxes. Moreover,
the derivation of disposable income allows one to consider the joint distribution of income, wealth
and consumption, which can be used to analyse issues relating to inequality and poverty.
Key words: EUROMOD, HFCS, simulations, gross-to-net incomes, wealth taxation
JEL Classification: C15, H24, I3
1 Introduction
The increasing accumulation of private wealth in Europe appears as one of the most striking
evolutions in the distributional literature over the last 40 years. Aggregate private wealth-national
income ratios have now returned to levels observed in the 19th century, ranging from
400% to 700%. Such levels are determined by different economic factors, such as the long-run asset
price recovery effect, high saving rates and low economic growth rates, at least partially sustained by
pro-capital policies (Piketty and Saez, 2013). Focusing on household resources, the ratios between
private wealth and disposable income are even higher; the rise of freely provided public services and
in-kind transfers such as health and education that occurred since the 1970s is an important factor in
explaining why disposable income has declined relative to national income (Piketty and Zucman,
2014).
High wealth-income ratios are not necessarily bad but they raise challenging issues about capital
taxation (Piketty, 2011, 2014) and the overall structure of inequality (Davies, 2009). First, while in the
last 50 years the contribution of wealth taxes to government revenues has diminished, there are
strong arguments nowadays for broadening the existing tax bases to include wealth and income from
wealth both on horizontal and vertical equity grounds. As long as wealth is more unequally
distributed than income, wealth taxes are attractive in distributional terms. Moreover, as far as
economic efficiency is concerned, wealth taxes minimise economic distortions by taxing fixed factors
(Hills, 2013). In a recent contribution, Piketty (2013) proposes to go beyond the national boundaries
and suggests the introduction of a comprehensive wealth tax at the European level, based on the
market value of the net personal worth. Second, given that wealth is in general very concentrated
(with a Gini coefficient ranging between 0.5 and 0.8 over time and across countries) and correlated
with income, the inequality of wealth is likely to exacerbate overall inequality. Taxing wealth is a way
to reduce this inequality. Hence, it is important to assess the role of the different wealth components
across countries, in order to set appropriate tax-free allowances and concentrate the tax burden on
the wealthy part of the population, given the increasing role of housing assets in household
portfolios along the entire income distribution (Figari, 2013).
In such a context the need for more comprehensive and integrated data on individual well-being is
widely recognised, as highlighted in the Report by the Commission on the Measurement of Economic
Performance and Social Progress (Stiglitz, Sen and Fitoussi, 2009). In order to identify better
measures of economic performance in a complex economy and thus go beyond GDP, Stiglitz, Sen
and Fitoussi recommend considering income, consumption and wealth and giving more prominence
to their joint distribution. New household surveys such as those developed as part of the Luxembourg
Wealth Study (Jäntti et al. 2013) and the Eurosystem Household Finance and Consumption Network
(HFCN, 2013a) represent a milestone in this ongoing process to better measure individual well-being.
Nevertheless, empirical research still faces severe limitations, mainly due to data availability, that constrain
what can be achieved. This paper aims at contributing to the recent
developments in this area by exploring the prospects for using the Eurosystem Household Finance
and Consumption Survey (HFCS) dataset as an underlying database for a tax-benefit microsimulation
model. In particular, we consider the issues to be addressed and the advantages arising from building
a database from the HFCS for the EU-wide tax-benefit model, EUROMOD. Although it is currently
running mostly on EU-SILC data, EUROMOD is built in a way that maximises its flexibility and
its capacity to simulate tax-benefit policies on different databases (Sutherland and Figari, 2013).
The main advantages of incorporating the HFCS data in EUROMOD are twofold. First, it allows us to
expand the policy domains currently covered in EUROMOD with dimensions like wealth taxation,
which recently gained much prominence in the academic as well as the public debate. In addition to
budgetary and distributional analysis of current wealth taxes, the model based on HFCS data would
allow for an integrated assessment of taxable capacity taking into account direct taxes on income
and wealth and tackling challenging issues such as those faced by ‘asset rich/income poor’
households (Hills, 2013). Moreover, it would make it possible to estimate the impact of reforms in wealth
taxation in interaction with other tax-benefit policies. Second, as the HFCS contains only gross
income amounts which are not suitable for redistributive analysis, we derive net incomes by
simulating the gross-to-net transition with EUROMOD, taking into account all important details of the
social security and personal income tax system. For the first time, this allows us to consider the joint
distribution of disposable income, wealth and consumption based on information coming from the
same survey, potentially comparable across countries and time.
In order to identify the issues and illustrate their importance, a trial database for Belgium is
constructed. We conclude that, although transforming the HFCS into a database for EUROMOD
would require a significant amount of effort, this is surely worthwhile because of the
interesting possibilities to extend the policy scope of EUROMOD and also to consider jointly the
redistributive effects of income and wealth taxes. Moreover, the derivation of disposable income
allows one to consider the joint distribution of income, wealth and consumption.
In the next section we briefly describe the advantages and limitations of tax-benefit models. In
section 3 we discuss what the HFCS data can contribute to policy simulation in EUROMOD. In section
4 we discuss the assumptions and transformations needed to construct a EUROMOD database on the
basis of the HFCS data, where Belgium is used as a case study. Section 5 then studies the results of
the derivation of net incomes for the HFCS data and validates them against the EU-SILC and where
possible external sources. The last section concludes.
2 Purpose of a tax-benefit model on the HFCS data
The main advantage of a tax-benefit model is that it allows one to focus quite accurately on the
objectives of social and economic policy, on the tools employed, and on the structural change
experienced by those to whom the measures apply. Unlike a macroeconomic model, a
microsimulation model allows one to simulate individual decision units. In the case of the HFCS data, these
decision units are households and the individuals who live in them. Fiscal rules are
incorporated into the model as accurately as possible, so that the impact on the individual
characteristics of a decision unit becomes apparent; the impact of social security and taxation may,
after all, vary considerably for different units. The various decision units may also be aggregated
according to different characteristics (e.g. age, social and professional category). As such, the model
allows one to test the redistributive potential of different tax-benefit systems, while taking due
account of social and demographic variables. Another important advantage of this method is that it
allows one to study a set of policy measures from two distinct perspectives. On the one hand, one
can focus on the cumulative effect of the various measures, and therefore also on the impact of the
entire set of transfer-oriented measures. On the other hand, a microsimulation model offers the
possibility of dissecting complex measures (e.g. step-by-step tax calculation for a household), so that
the impact of each step may be considered separately.
As described in Figari et al. (2015), different types of analysis are facilitated by using a
microsimulation approach, among others:
- impact of tax-benefit policy changes (e.g. reforms regarding wealth and income taxation) on
income-based indicators and related statistics (e.g. poverty and inequality indicators);
- impact of demographic factors on disposable income through the effects of tax-benefit
policies (e.g. public support to families contingent on presence of children, see Figari et al.,
2011);
- impact of policy changes over time (e.g. profiles of gainers and losers of a policy indexation
and policy reforms);
- impact of policy changes on social indicators capturing work incentives (e.g. effective
marginal tax rates, participation tax rates) or social inclusion (e.g. multiple deprivation).
Of course, simulation models also have inherent limitations. These models use empirical data that
are either obtained by means of surveys or from administrative sources. As such, the accuracy of the
results depends on the quality of the data (e.g. adequate information about the relevant socio-
economic characteristics, a sufficiently large sample). Another limitation is the cost involved in
constructing and maintaining such a model: developing a tax-benefit model requires time and
money. Therefore, one will need to make certain considerations in terms of policy areas covered,
incorporation of demographic and macro-economic processes or behavioural reactions.
In order to exploit the cross-country dimension of the HFCS data, it is quite natural to build a
database from the HFCS for EUROMOD, the EU-wide tax-benefit model, rather than for separate
national tax-benefit models. Moreover, EUROMOD is built in a way that maximises its flexibility and
its capacity to simulate tax-benefit policies on different databases.
EUROMOD simulates cash benefit entitlements and direct tax and social insurance contribution
liabilities on the basis of the tax-benefit rules in place and information available in the underlying
datasets. Instruments which are not simulated (mainly contributory pensions), as well as market
income, are taken directly from the data (Sutherland and Figari, 2013). As such, EUROMOD is of value
in terms of assessing the first order effects of tax-benefit policies and in understanding how tax-
benefit policy reforms may affect income distribution, work incentives and government budgets in
the short term.
Currently EUROMOD runs mostly on EU-SILC data, which have only limited information on wealth and
income from wealth. Incorporating the HFCS data will make it possible to expand the policy domains currently
covered in EUROMOD with dimensions such as wealth taxation. This will enable simulations relating to
issues such as a tax shift from income to wealth (currently a hotly debated topic in, for example, Belgium). It will
help to understand and measure the redistributive role of these policies in relation to the other
tax-benefit rules. With subsequent waves of the HFCS becoming available, the microsimulation model will
also make it possible to investigate changes over time and to determine to what extent these are due to
changes in the underlying population or to changes in the policies.
The second purpose of running EUROMOD on HFCS data is to derive a proper measure of disposable
income, as the HFCS contains only gross income amounts which are not suitable for redistributive
analysis. For the first time, this allows us to consider the joint distribution of disposable income,
wealth and consumption based on information coming from the same survey, potentially
comparable across countries and time.
3 HFCS and its perceived advantages over EU-SILC
The Eurosystem Household Finance and Consumption Survey (HFCS) is a new dataset covering
detailed household wealth, gross income and consumption information (Eurosystem Household
Finance and Consumption Network [HFCN from now onwards], 2013a). It is the result of a joint effort
of all National Banks of the euro zone, three National Statistical Institutes[1] and the European Central
Bank (ECB). The first wave was made available to researchers in April 2013 and contains information
on more than 62,000 households in 15 euro area member states, which were surveyed mostly in 2010
and 2011[2]. Ireland and Estonia are not included in the first wave, but joined in the second wave (fieldwork period
2014). Moreover, Latvia, which joined the euro zone on 1 January 2014, has also carried out
the survey for the second wave.
[1] Of France, Finland and Portugal.
[2] Exceptions are France (2009/2010), Greece (2009) and Spain (2008/2009).
An important shortcoming for the direct research use of the HFCS data is that they only cover gross
income amounts, which makes them unsuitable, for instance, for the analysis of inequality and
redistribution. Nevertheless, the income components that are covered in the HFCS are largely the
same as those surveyed in EU-SILC. More specifically, the HFCS gross income concept includes the
following components: employee income, self-employment income, rental income from real estate
property, income from financial investments, income from pensions (public, occupational & private),
regular social transfers, regular private transfers, income from private business and income from
other sources (HFCN, 2013b, p.108). The major differences with the income concept of EU-SILC are
presented in Table 1. First, it is clear that in the category of employee income the HFCS only asks
respondents about cash and near cash income, while EU-SILC also captures non-cash income.
Secondly, pensions from mandatory employer-based schemes are included in public pensions in EU-
SILC, while they are covered under private pensions in the HFCS (HFCN, 2013a, p.100). Finally,
income received by people under 16 is covered in EU-SILC, but not in the HFCS. In contrast, the HFCS
covers income from other types of sources (such as capital gains or losses from the sale of assets,
prize winnings, insurance settlements, severance payments, lump sum payments upon retirement),
while EU-SILC does not. However, considering the joint patterns of income and wealth inequality in
Belgium, Kuypers et al. (2015) show that despite these methodological similarities, a non-negligible
difference in gross income distributional outcomes exists between the HFCS and EU-SILC, mainly at
the top of the distribution, which is arguably the consequence of the oversampling strategy
implemented in the HFCS (see below for more details). Such differences suggest that it is not enough to
look at median incomes (HFCN, 2013, p. 100) to provide a reliable comparison between different
surveys.
Table 1: Comparison of gross income components HFCS and EU-SILC
HFCS | EU-SILC
Employee income (cash & near cash income) | Employee cash or near cash income
- - - | Non-cash employee income
Self-employment income | Cash benefits or losses from self-employment
Rental income from real estate property | Income from rental of a property or land
Income from financial investments; income from private business other than self-employment | Interest, dividends, profit from capital investments in unincorporated business
Public pensions (old-age pension, survivor pension, disability pension) | Old-age benefits; survivor benefits; disability benefits
Occupational & private pensions | Pensions from individual private plans
Unemployment benefits | Unemployment benefits
Other social transfers (family/children related allowances, housing allowances, education allowances, minimum subsistence, other social benefits) | Family/children related allowances; housing allowances; education-related allowances; sickness benefits; social exclusion not elsewhere classified
Regular private transfers | Regular inter-household cash transfer
- - - | Income received by people aged under 16
Income from other sources | - - -
Source: HFCN (2013) & European Commission
The HFCS dataset contains some very interesting features. First, the very wealthy are oversampled
such that a better coverage of the top of the income and wealth distributions is obtained. This is
necessary because there exist large sampling and non-sampling errors as a consequence of the large
skewness of the wealth distribution. In particular the wealthiest households are less likely to respond
and more likely to underreport, especially in the case of financial assets (Davies et al., 2011).
Moreover, it also makes the rather small sample more representative. Hence, in contrast to EU-SILC
which should represent the entire income distribution and is used to identify poor households, the
HFCS focusses on the top of the distribution (HFCN, 2013a, p.98-99). Since taxes typically have a
larger impact on the top of the distribution, the implementation of the HFCS in EUROMOD should
lead to more accurate outcomes on the distributional and budgetary effects of taxation. The HFCN
(2013b, p.21) indicates that this oversampling strategy in some countries comes at the expense of
coverage at the bottom of the distribution, but it is not clear to what extent this is the case in
practice. As a consequence, the benefit side of the redistributive system may still be better covered
by EU-SILC.
A second interesting feature of the HFCS data is that a multiple imputation technique was used to
deal with selective item non-response (in the form of five different imputations). In other words,
crucial income and wealth information does not need to be imputed by researchers in the process of
building the database. Such imputation is not performed as standard in EU-SILC, which leaves the
imputation choices to the researcher. Moreover, five different imputations can be expected to lead to more
accurate outcomes than a single imputation. The number of covariates used for the imputation,
however, differs considerably between countries as well as by income or asset type[3]. Moreover, the
concrete variables that are used for these imputations are not documented. Therefore, the quality of
imputations for individual countries may be hard to evaluate (Tiefensee & Grabka, 2014).
The largest added value from using the HFCS data as an underlying database for EUROMOD is that it
covers much more detailed information on wealth issues. This will allow the expansion of policy
domains currently covered in EUROMOD with taxation of wealth and income from wealth. In the trial
database created for this paper, however, this was not yet implemented. We only constructed the
database in the same manner as the one based on EU-SILC, as currently required for
inclusion in EUROMOD, in order to obtain a distribution of disposable income and to measure the redistributive
effect of the tax-benefit system.
In order to evaluate the suitability of the HFCS as a EUROMOD database we construct a trial database
and validate the main outcomes of running EUROMOD on the HFCS, comparing them with those
obtained using EU-SILC as input database. The HFCS data potentially supplies micro data on 15 euro
area member states. However, the quality and reliability of the HFCS data are not yet clear for all
countries. For Belgium an extensive validation of the HFCS data against external data sources such as
EU-SILC and SHARE indicates that the HFCS is sufficiently reliable for the study of income and wealth
in Belgium (Kuypers et al., 2015). Practical issues in the creation of this database are discussed in the
following part.
[3] For example, the imputation of missing values of employee income is based on 224 covariates in Spain, while the
Netherlands use only 5 variables (HFCN, 2013a, p.51).
4 The data requirements for EUROMOD and the HFCS: a case study
for Belgium
Figari et al. (2007) list a set of basic data requirements that a database must fulfil in order to be
incorporated in a sensible way in EUROMOD. These are:
- The database used must be a recent, representative sample of households, large enough to
support the analysis of small groups and with weights to apply to population level and
correct for non-response;
- The database must contain information on primary gross incomes by source and at the
individual level, with the reference period being relevant to the assessment periods for taxes
and benefits. When benefits cannot be simulated, information on the amount of these
benefits, gross of taxes, is required for each recipient;
- The database must contain information about individual characteristics and within-
household family relationships;
- It must contain information on housing costs and other expenditures that may affect tax
liabilities or benefit entitlements;
- Specific other information on characteristics affecting tax liabilities or benefit entitlements
(examples include weekly hours of work, disability status, civil servant status, private pension
contributions) is also necessary;
- The same reference period(s) should apply to personal characteristics (e.g. employment
status) and income information (e.g. earnings) corresponding to it. In principle this implies
the recording of status variables for each period within the year;
- There should be no missing information from individual records or for individuals within
households. Where imputations have been necessary, detailed information about how they
were done is necessary.
In general, most of these requirements are met for the HFCS data (see also previous section). We
now provide more details on how the HFCS scores on these requirements for Belgium, which is
presented here as a pilot exercise. We make use of the UDB 1.1 data version of the HFCS (February
2015 release), from which we construct a new dataset containing the mean estimate over the five
imputations for each case where such a multiple imputation was done. We highlight issues of sample
size, reference period, imputation of missing information, the disaggregation of certain variables into
more detailed information, etc.
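The averaging over the five imputations can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the trial database: the record layout and the keys `hid`, `implicate` and `gross_income` are hypothetical, and the values are toy numbers.

```python
from collections import defaultdict


def mean_over_implicates(rows, value_key):
    """Average one variable over all implicates of each household.

    rows: iterable of dicts, one per household and implicate, holding the
    hypothetical keys 'hid' (household id) and value_key.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["hid"]] += row[value_key]
        counts[row["hid"]] += 1
    return {hid: totals[hid] / counts[hid] for hid in totals}


# Toy data: household 1 has an imputed income that differs by implicate,
# household 2 reported its income, so all five implicates are identical.
rows = [
    {"hid": 1, "implicate": m, "gross_income": v}
    for m, v in enumerate([30000.0, 31000.0, 29000.0, 30500.0, 29500.0], start=1)
] + [
    {"hid": 2, "implicate": m, "gross_income": 42000.0} for m in range(1, 6)
]

means = mean_over_implicates(rows, "gross_income")
```

Collapsing the implicates into one mean gives a single input value per case, which is what a tax-benefit model needs, but it no longer carries the variation between imputations.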
Sample
The UDB data for Belgium include information on 5,506 individuals living in 2,327 households. They
were surveyed between April and October 2010, so that the reference income period is 2009. The
oversampling of the wealthy was implemented in Belgium based on the NUTS 1 region and the
average income by neighbourhood of residence, which results in an effective oversampling rate of
the top 10% equal to 47 per cent (HFCN, 2013c). As mentioned before, missing information on crucial
variables is multiply imputed, so that in principle the full sample can be used for the construction of the
EUROMOD input database. However, following common EUROMOD conventions, in the creation of
the EUROMOD input database children that were born after the end of the income reference period
are deleted from the sample. In the HFCS we only know the age of the individual at the time of the
interview, not the year in which they were born. We assume that all individuals aged 0 were born
after the income reference period. Since most Belgian interviews were done in the second half of
2010, this assumption is relatively acceptable. In the case of Belgium this concerns 18 children who are still
in their first year of life. Hence, the final sample covers 5,488 individuals. Some descriptive statistics
used for the grossing up to the level of the full population (10.8 million people) are presented in
Table 2. A comparison with those for EU-SILC immediately shows that the HFCS sample is much
smaller and therefore its statistical reliability may be lower.
Table 2: Descriptive statistics of sample and weights
Observations Mean weight Median weight Min weight Max weight
HFCS 5,488 1,961.1 1,274.8 149.7 12,205.7
EU-SILC 14,700 727.1 651.9 97.2 4,523.1
Source: own calculations based on HFCS
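As a quick consistency check on the grossing-up, the number of observations multiplied by the mean weight equals the sum of the weights, which should be close to the population total of 10.8 million mentioned above. The sketch below only reuses the figures from Table 2; the function and variable names are illustrative.

```python
# Observations and mean weights copied from Table 2.
samples = {
    "HFCS": (5488, 1961.1),
    "EU-SILC": (14700, 727.1),
}


def grossed_up_population(n_obs, mean_weight):
    """Sum of weights implied by the sample size and the mean weight."""
    return n_obs * mean_weight


totals = {name: grossed_up_population(n, w) for name, (n, w) in samples.items()}

for name, total in totals.items():
    print(f"{name}: about {total / 1e6:.2f} million people")
```

Both samples gross up to roughly the same population, while each HFCS respondent represents far more people on average, which is the reason given above for the lower statistical reliability.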
Reference period
The HFCS questionnaire asks individuals to declare incomes received in 2009, but all aspects relating
to assets and debt holdings as well as demographic and economic characteristics refer to the time of
the interview. We have to make the assumption that these aspects have not changed compared to
the income reference period. For example, since we only know the age of individuals at the time of
interview and not the birth year, we cannot just subtract 1 year from age because we do not know
whether the person has already celebrated their birthday when the interview takes place. Moreover,
we do not know whether an individual has perhaps experienced a change in their labour market status,
marital status, etc. We deem it reasonable to assume that the largest share of individuals has not
experienced a change in their main demographic and economic characteristics, or that such a change
has no large impact on the outcomes. In sum, the practice is basically the same as the one used
when deriving an EU-SILC based EUROMOD input database.
Adjustments of variables
With the exception of certain variables, EUROMOD input variables on labour market information,
incomes, benefits, etc. need to be covered at the individual level. As in EU-SILC a number of these
components are surveyed at the household level in the HFCS. In order to divide these between
individuals we followed the same process that was developed for the EU-SILC based input database.
The components for which this applies are:
- Rental income from real estate property
- Income from financial investments
- Income from regular social transfers
- Income from other sources
It is important to note that the EUROMOD variable ‘INCOME: other’ (yot) in the EU-SILC refers to
income received by individuals younger than 16 years, while it refers to income received from other
sources in the HFCS.
Variables that could not be created using the HFCS as an underlying dataset and which are used in
EUROMOD - Belgium are firm size (lfs) for the calculation of employer contributions and Belgian
cadastral income of the own residence (khooo) for the calculation of personal income taxes.
Disaggregation of social transfers
In the original HFCS dataset all incomes from regular social transfers (except pensions and
unemployment benefits) are covered under one aggregated variable (HG0110), while EUROMOD
requires all types of benefits to be covered separately. As this variable includes income sources
received both at the individual level (e.g. educational allowances) and at the household level (e.g. housing
allowances, family benefits) which are not mutually exclusive, its breakdown into separate
components is not straightforward. Child benefits and social assistance for Belgium can be accurately
simulated in EUROMOD and are considered to be the most important components, child benefits also being
the most widespread among households. Therefore, we opted to create a EUROMOD variable
‘BENEFIT: Other’ (bot) which was set equal to the aggregate reported variable and to simulate the
child benefit and social assistance in EUROMOD[4], after which these two values are subtracted from
the aggregate variable. As a result of this process we have three output variables: one containing the
simulated child benefits, one containing the simulated social assistance benefits and one including all
other types of benefits covered in HG0110. Where the simulated benefits turn out to be larger than
the reported amounts, we proceed in three steps. First, we use the simulated benefits and assume no other benefits
when the difference between simulated and declared benefits is smaller than 150 euros per month.
Second, for households where no social benefits are declared and there are indications of a
reconstituted family, child benefits are set to zero. This is because in Belgium mothers typically
receive child benefits, so that if the father and child are part of a new household in the dataset no
child benefits are received by them. Finally, for the remainder of households the difference between
declared and simulated social benefits is relatively large, with a large share of households with
dependent children not declaring any social benefits (about 85% of the remaining households)[5].
Therefore, we assume for all remaining households that they have ‘forgotten’ to declare child
benefits. We simply make use of the simulated child benefits, while the declared benefit amount, if
any, is considered to refer to other types of social benefits.
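The decision rules above can be sketched as a small function. This is an illustration under stated assumptions rather than the procedure actually coded for the trial database: the function name, argument names and the returned keys are hypothetical, all amounts are monthly euros, the rules are applied in the order in which the text lists them, and keeping the simulated social assistance in the last two cases is an assumption, since the text only discusses child benefits there.

```python
def split_social_transfers(reported, sim_child, sim_assist,
                           reconstituted=False, tolerance=150.0):
    """Split the reported aggregate (HG0110) into three components.

    reported      : declared aggregate social transfers (monthly euros)
    sim_child     : simulated child benefit
    sim_assist    : simulated social assistance
    reconstituted : flag for indications of a reconstituted family
    tolerance     : 150 euros per month, the threshold named in the text
    """
    simulated = sim_child + sim_assist

    # Baseline: simulated benefits fit inside the reported aggregate,
    # the remainder is treated as other benefits.
    if simulated <= reported:
        return {"child": sim_child, "assist": sim_assist,
                "other": reported - simulated}

    # Step 1: small excess, keep simulated values and assume no other benefits.
    if simulated - reported < tolerance:
        return {"child": sim_child, "assist": sim_assist, "other": 0.0}

    # Step 2: nothing declared in a reconstituted family, no child benefit.
    if reported == 0 and reconstituted:
        return {"child": 0.0, "assist": sim_assist, "other": 0.0}

    # Step 3: child benefit assumed 'forgotten'; the declared amount, if any,
    # is taken to refer to other benefits.
    return {"child": sim_child, "assist": sim_assist, "other": reported}


example = split_social_transfers(reported=50.0, sim_child=400.0, sim_assist=0.0)
```

In the example call the household declares 50 euros but is simulated to receive 400 euros of child benefit, so the third step applies: the simulated child benefit is kept and the declared 50 euros are booked as other benefits.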
In the case of income received from public pensions, all types of benefits are also included in one
aggregated variable (PG0310). In order to obtain separate EUROMOD input variables for old age
benefits, survivor benefits and disability benefits, the aggregated variable was split by imputation. As we
assume these three types of benefits to be mutually exclusive, we imputed old age benefits as all
pension income received from the age of 65 onwards and survivor benefits as those pensions
received under the age of 65 by widowed persons. Finally, disability benefits are all those pensions
received by someone who is ‘permanently disabled’ according to one of their declared labour statuses.
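The allocation of the aggregate public pension variable can be written as a simple classification rule. The sketch below is hypothetical in its names and in two respects that the text leaves open: the order in which the conditions are checked, and the label given to recipients who meet none of the three conditions.

```python
def classify_public_pension(age, widowed, permanently_disabled):
    """Assign the aggregate public pension (PG0310) to one benefit type.

    The precedence (age first, then widowhood, then disability) is an
    assumption; the text only states that the types are mutually exclusive.
    """
    if age >= 65:
        return "old_age"        # pension received from age 65 onwards
    if widowed:
        return "survivor"       # under 65 and widowed
    if permanently_disabled:
        return "disability"     # declared labour status: permanently disabled
    return "unclassified"       # not covered by the rules in the text


labels = [
    classify_public_pension(70, widowed=False, permanently_disabled=False),
    classify_public_pension(60, widowed=True, permanently_disabled=False),
    classify_public_pension(50, widowed=False, permanently_disabled=True),
]
```

A recipient below 65 who is neither widowed nor permanently disabled falls outside the three rules; how such cases were handled is not described, so the sketch merely flags them.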
Imputation of main residence mortgages
The HFCS dataset covers very detailed information on mortgages held for the main residence, among
others the monthly payment that is made. However, EUROMOD requires a specification of the part
that is paid in interest and the capital part. First, it should be noted that wealth information in the
HFCS refers to the time of the survey, hence 2010. We assume that mortgage payments are the same
in 2009 as in 2010. Mortgages that were taken or refinanced in 2010 are not included. For
households that refinanced their loan in 2010 we lose the information on the fact that they did have
a mortgage in 2009. However, we do not know the specificities such as the old interest rate of this
4 Similarly to the EU-SILC based simulations, the amounts of social assistance are adjusted for non-take-up of benefits with a
random non-take-up correction.
5 Overall, approximately 46.7% of households with dependent children do not report any social benefits.
loan. This only involves 8 households. Second, we assume that all households made payments
in all 12 months. This could be a problem if the mortgage was taken out or expired during the income
reference period. Furthermore, we used the following formula to split the mortgage repayment into
an interest part and a capital part:
$$\text{interest part} = \text{repayment} \times \left[1 - (1 + i)^{k - n - 1}\right]$$
Where $i$ refers to the interest rate, $n$ to the duration of the mortgage and $k$ to the time of the
mortgage period that already passed. Subtracting this interest part from the repayment amount
gives the capital part. We have detailed information on duration and interest rate for the first two
mortgages only. For the third mortgage onwards we only have information on the monthly
repayment. We opt to apply the parameters of the second mortgage to the payments for these other
mortgages. In the Belgian sample 16 households have a third mortgage for their main residence.
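The split of a repayment into interest and capital can be illustrated with a short sketch. It assumes a constant-payment (annuity) mortgage and reads the symbols in the formula as follows: i is the interest rate per payment period, n the total number of payments and k the number of the current payment. This reading is an assumption, since the text does not state the units; the function name is hypothetical.

```python
def split_repayment(repayment, i, n, k):
    """Split one constant repayment into an interest part and a capital part.

    Assumptions: i is the rate per payment period, n the total number of
    payments and k the number of the current payment (1 <= k <= n).
    """
    if not 1 <= k <= n:
        raise ValueError("k must lie between 1 and n")
    interest = repayment * (1 - (1 + i) ** (k - n - 1))
    capital = repayment - interest
    return interest, capital


# Example: 1,000 per month, 0.5% per month, 240 payments, first payment.
first_interest, first_capital = split_repayment(1000.0, 0.005, 240, 1)
# Last payment of the same loan: almost the whole repayment is capital.
last_interest, last_capital = split_repayment(1000.0, 0.005, 240, 240)
```

Under these assumptions the interest share falls over the life of the loan, so using the parameters of the second mortgage for a third mortgage matters most when the two loans are at very different stages of repayment.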
Missing regional information
Unfortunately the HFCS UDB data do not include information on the region households live in. In the
construction of the trial database we arbitrarily assumed all households to live in Flanders, as this
region has the largest population share and this assumption will therefore have the smallest impact on the EUROMOD
outcomes. However, the EUROMOD module simulating the Flemish contribution to care insurance
was switched off, because otherwise all households would be liable to pay this specific contribution6.
In other cases the effect on outcomes is probably smaller, as the policies concerned apply in all regions and
differ only in the level of their rates. The lack of NUTS1 information, however, will turn out to be a large problem in
the future, because the sixth state reform involves a substantial transfer of tax-benefit competences
from the federal to the regional level.
5 Simulating net incomes using the Belgian HFCS data
In this part we discuss the outcomes of the EUROMOD simulations based on the Belgian HFCS data.
Moreover, in order to validate these results we compare them to those obtained with the EU-SILC
database and, where possible, to other available sources.
We start by analysing how well the HFCS and EU-SILC data represent the Belgian population. Table 3
provides an overview of some basic socio-demographic indicators. Overall, the HFCS and EU-SILC
data appear to represent the Belgian population in a similar way. While the age and gender
distribution are highly similar, the results for highest education achieved and tenure status are found
to diverge slightly more. Relating to the latter, both surveys cover about the same home-ownership
rate, but they differ somewhat in their breakdown by mortgage holding. Comparing the outcomes
for both surveys with those of external sources, however, indicates that neither sample is
completely representative of the Belgian population. While the HFCS more closely represents the
distribution of highest education achieved than EU-SILC, it appears to underestimate the share of
self-employment. However, in the HFCS respondents can declare several labour statuses. In our
6 As this contribution amounts to maximum 50 euro per year, the overall effect of this omission is probably negligible.
definition of labour market status we used only the one that was reported first. It is possible that an
individual declares self-employment as the second, third, etc. labour status at the time of interview
(2010), but has worked in self-employment throughout the main part of the income reference period
(2009) (see discussion of reference period above).
Table 3: Comparison of socio-demographic characteristics between HFCS, EU-SILC and EUROSTAT

                                   HFCS 2009   EU-SILC 2009   External 2009
Age
  0-15                             17.7        18.2           18.1
  16-29                            17.5        17.5           17.4
  30-44                            21.3        21.1           20.9
  45-64                            26.4        27.0           27.5
  65+                              17.2        16.2           16.1
Gender
  Female                           51.0        50.8           51.0
  Male                             49.0        49.2           49.0
Highest education achieved (*)
  Not completed primary            12.8        18.2           19.2 (*)
  Primary                          11.5        12.8           (*)
  Lower secondary                  16.0        18.0           20.2
  Upper secondary                  30.9        25.1           34.6
  Post-secondary                   N/A         1.8            N/A
  Tertiary                         28.8        24.1           26.1
Labour market status
  Pre-school                       5.9         7.3            (**)
  Employer or self-employed        3.6         4.1            6.0
  Employee                         36.4        35.7           33.1
  Pensioner                        21.0        18.6           17.7
  Unemployed                       6.5         5.1            4.9
  Student                          19.8        18.1           22.9 (**)
  Inactive                         0.0         1.4            1.0
  Sick or disabled                 2.4         3.0            2.2
  Other                            4.2         6.5            11.6
  Family worker                    0.1         0.2            0.6
Marital status
  Single                           46.6        44.6           43.6
  Married                          40.8        40.5           41.7
  Separated                        N/A         0.3            N/A
  Divorced                         6.6         8.8            8.1
  Widowed                          6.0         5.8            6.6
Tenure status
  Owned on mortgage                37.4        41.6           69.0 (***) (owned, total)
  Owned outright                   36.2        30.2
  Rented                           24.7        19.5           28.8 (***) (rented, total)
  Reduced rented                   N/A         7.2
  Social rented                    N/A         N/A
  Free user                        1.7         1.4

Notes: outcomes are estimated for the EUROMOD sample, not the survey sample; (*) external data only on persons aged 15 years and over, first figure refers to joint category of not completed primary and primary education; (**) includes all children that are entitled to child benefits; (***) figures for 2011, only breakdown into owned versus rented house is available, rest category (2.2%) refers among others to collective residential accommodations which are typically not part of a survey sample
Source: own calculations and external sources: age, gender and marital status: EUROSTAT based on CENSUS, education attained: FOD Economics, Department Statistics, labour market status: Data warehouse labour market and social protection of the Crossroads Bank for Social Security, tenure status: CENSUS 2011
Now we move on to the analysis of the distributional outcomes. All figures in this part are computed
for individuals based on their household disposable income equivalised by the OECD modified scale7
and expressed in annual terms, unless mentioned otherwise.
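As an illustration of the equivalisation step, the sketch below applies the OECD modified scale with the weights given in footnote 7. The function names are hypothetical, and the treatment of a household without any member aged 14 or over (the first member then receives the weight of 1) is an assumption made only to keep the example well defined.

```python
def oecd_modified_scale(ages):
    """Equivalence scale: 1 for the first adult, 0.5 for each additional
    person aged 14 or over, 0.3 for each person younger than 14."""
    if not ages:
        raise ValueError("household must have at least one member")
    n_older = sum(1 for a in ages if a >= 14)
    n_younger = len(ages) - n_older
    if n_older == 0:
        # Assumption: with no member aged 14+, one member takes the weight 1.
        return 1.0 + 0.3 * (n_younger - 1)
    return 1.0 + 0.5 * (n_older - 1) + 0.3 * n_younger


def equivalised_income(household_income, ages):
    """Household disposable income divided by the household's scale value."""
    return household_income / oecd_modified_scale(ages)


# Two adults and two children under 14: scale = 1 + 0.5 + 0.3 + 0.3 = 2.1
scale = oecd_modified_scale([40, 38, 10, 5])
income = equivalised_income(42000.0, [40, 38, 10, 5])
```

Each household member is then assigned the same equivalised income, which is why the figures in this part refer to individuals rather than households.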
Table 4 shows the outcomes of EUROMOD based on the HFCS in terms of income inequality. We
show outcomes for disposable as well as original income (including pensions); the difference
between the two consists of taxes, social insurance contributions and benefits. The HFCS
median is found to be very similar to the EU-SILC median (see also HFCN (2013, p. 100) for
cross-country evidence), while mean estimates appear to be somewhat higher based on the HFCS,
especially for original income. The inequality indices, however, require more
attention and further investigation, as they show a rather large discrepancy between the outcomes
of the HFCS and EU-SILC. As we will discuss below, this is likely to be a consequence of the
oversampling strategy applied in the HFCS. The Kakwani measure of progressivity is shown in Table
5. As expected given the oversampling strategy, all components of the tax-benefit system are found
to be more progressive in the HFCS than in the EU-SILC database.
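For readers less familiar with these indicators, the following sketch shows one common way to compute a weighted Gini coefficient and a Kakwani index. It is not the estimator used for Tables 4 and 5, whose details are not given here. It assumes the textbook definition of the Kakwani index for taxes (concentration coefficient of the tax, with individuals ranked by pre-tax income, minus the Gini coefficient of pre-tax income) and a trapezoid approximation of the area under the Lorenz and concentration curves. All function names are hypothetical.

```python
import numpy as np


def concentration(values, rank_by, weights=None):
    """Concentration coefficient of `values` with units ranked by `rank_by`."""
    values = np.asarray(values, dtype=float)
    rank_by = np.asarray(rank_by, dtype=float)
    w = np.ones_like(values) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(rank_by, kind="stable")
    v, w = values[order], w[order]
    # Cumulative population share and cumulative share of `values`.
    pop = np.concatenate(([0.0], np.cumsum(w) / w.sum()))
    cum = np.concatenate(([0.0], np.cumsum(v * w) / (v * w).sum()))
    # Area under the curve by the trapezoid rule.
    area = np.sum((pop[1:] - pop[:-1]) * (cum[1:] + cum[:-1]) / 2.0)
    return 1.0 - 2.0 * area


def gini(income, weights=None):
    """Gini coefficient: concentration of income ranked by itself."""
    return concentration(income, income, weights)


def kakwani(tax, pre_tax_income, weights=None):
    """Kakwani index for a tax: tax concentration minus pre-tax Gini."""
    return concentration(tax, pre_tax_income, weights) - gini(pre_tax_income, weights)


incomes = [10.0, 20.0, 30.0, 40.0]
flat_tax = [0.2 * y for y in incomes]      # proportional tax
top_only_tax = [0.0, 0.0, 0.0, 10.0]       # tax paid by the richest unit only
```

A proportional tax yields a Kakwani index of zero, while a tax concentrated at the top yields a positive value. This is why heavier coverage of top incomes, as in the HFCS, tends to raise measured progressivity.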
Table 4: Comparison of income inequality indicators between HFCS and EU-SILC

                                    EUROMOD 2009    EUROMOD 2009       EU-SILC
                                    based on HFCS   based on EU-SILC   incomes 2009
Disposable income
  Mean                              21,995          20,036             21,201
  Median                            18,977          18,919             19,469
  Gini coefficient                  0.3172          0.2259             0.2602
  Income quintile ratio (S80/S20)   5.00            3.21               3.82
Original income
  Mean                              29,435          25,247             25,635
  Median                            22,263          22,638             22,917
  Gini coefficient                  0.4655          0.3771             0.3792
  Income quintile ratio (S80/S20)   17.05           11.08              11.29

Source: own calculations
Table 5: Comparison of progressivity HFCS and EU-SILC

Kakwani index                     EUROMOD 2009    EUROMOD 2009
                                  based on HFCS   based on EU-SILC
Taxes                             0.3737          0.2811
Social insurance contributions    0.2557          0.2176
Social benefits                   0.3140          0.2880

Source: own calculations
7 The OECD equivalence scale is constructed by giving the first adult a weight 1, any additional individuals aged 14 years or
over 0.5, while individuals younger than 14 count for 0.3.
In Table 6 we show how the impact of the tax-benefit system is distributed across deciles. We find
that the difference in inequality is mainly driven by divergence at the top and the bottom of the
income distribution. While the average equivalised disposable income in the 10th decile is equal to
€58,545 based on the HFCS, it is only €37,580 for EU-SILC. Average disposable
income in the bottom decile is approximately 33% higher in EU-SILC than in the HFCS (€8,235 versus €6,177). Moreover,
differences are mainly found with regard to taxes and social insurance contributions, which are
typically based on the income level. Outcomes for the benefits received are much
more similar, as eligibility is often based on non-monetary aspects, such as the presence of children
in the case of child benefits. This again indicates that the difference in outcomes
between the two surveys can mainly be attributed to the HFCS oversampling strategy.
Table 6: Comparison between HFCS and EU-SILC of averages of different components by decile of equivalised disposable income

Decile   Disposable   Original   Benefits   Taxes    Social insurance
         income       income                         contributions

EUROMOD 2009 based on HFCS
1        6,177        2,657      3,700      -26      206
2        11,215       8,466      3,588      201      637
3        13,725       12,884     2,567      751      976
4        15,869       16,683     2,737      1,863    1,689
5        18,021       20,463     2,516      2,933    2,025
6        19,958       23,844     2,817      4,062    2,641
7        22,497       30,071     2,063      6,138    3,499
8        25,110       35,073     2,090      7,859    4,195
9        29,043       44,248     1,315      11,244   5,276
10       58,545       100,382    2,183      34,731   9,289
Total    21,995       29,435     2,559      6,961    3,038

EUROMOD 2009 based on EU-SILC
1        8,235        4,191      4,452      58       350
2        12,089       9,304      3,952      420      747
3        14,210       13,134     3,287      1,149    1,062
4        16,181       16,704     3,225      2,057    1,691
5        18,090       20,184     3,117      3,054    2,156
6        19,887       24,459     2,662      4,368    2,866
7        21,960       28,485     2,418      5,601    3,342
8        24,408       33,583     2,446      7,493    4,128
9        27,744       40,812     2,084      10,217   4,934
10       37,580       61,668     2,298      19,335   7,051
Total    20,036       25,247     2,994      5,373    2,832

Source: own calculations
Table 7 shows a comparison of poverty rates for several poverty thresholds. At each poverty line the
share of poor individuals is higher in the HFCS than in EU-SILC, although the gap decreases slightly
at higher poverty thresholds. However, the HFCS outcomes are closer to the poverty rates based on
reported EU-SILC disposable incomes. It is well-known that disposable income simulated in
EUROMOD differs to an important extent from the reported disposable incomes in the UDB EU-SILC
data (Hufkens et al., 2014). At the official threshold of 60% of median equivalised disposable income
the poverty rate is equal to 15.3% for the HFCS EUROMOD database, 11.7% for the EU-SILC
EUROMOD database and 14.6% for the UDB EU-SILC. As was shown in Table 4 the HFCS and EU-SILC
medians are similar. Hence, the difference in poverty rates can hardly be attributed to a difference in
poverty thresholds. However, Table 6 showed that HFCS figures of net disposable incomes are lower
at the bottom of the distribution and higher at the top of the distribution compared to the EU-SILC
outcomes. This directly impacts the number of individuals below 60% of median income.
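The poverty rates discussed here follow a simple recipe: take a share of the median equivalised disposable income as the poverty line and count the weighted share of individuals below it. The sketch below illustrates that recipe under stated assumptions. The function names are hypothetical, the weighted median is taken as the lowest income at which the cumulative weight share reaches one half, and an individual counts as poor when income is strictly below the line.

```python
import numpy as np


def weighted_median(values, weights):
    """Lowest value at which the cumulative weight share reaches 50%."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values, kind="stable")
    v, w = values[order], weights[order]
    cum_share = np.cumsum(w) / w.sum()
    return v[np.searchsorted(cum_share, 0.5)]


def poverty_rate(income, weights, share_of_median=0.6):
    """Weighted share of individuals with income below the poverty line."""
    income = np.asarray(income, dtype=float)
    weights = np.asarray(weights, dtype=float)
    line = share_of_median * weighted_median(income, weights)
    return weights[income < line].sum() / weights.sum()


# Toy data: five individuals with equal weights, median income 30.
toy_income = [10.0, 20.0, 30.0, 40.0, 100.0]
toy_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
rate_60 = poverty_rate(toy_income, toy_weights, 0.6)   # line = 18
rate_70 = poverty_rate(toy_income, toy_weights, 0.7)   # line = 21
```

The toy data also illustrate the argument in the text: two samples can share the same median, and therefore the same poverty line, and still yield different poverty rates if incomes below the median are distributed differently.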
If we look at the distribution of poverty across age categories in Table 8, it appears that the
overrepresentation of poor individuals in the HFCS compared to EU-SILC is not evenly distributed by
age. While poverty rates are relatively similar for the two surveys in the age
categories of 0-15 and 65+, the HFCS poverty rates in the other three age categories are much higher,
especially for individuals aged between 16 and 29.
Table 7: Comparison of poverty rates at different poverty lines between HFCS and EU-SILC

Percentage of individuals below:               EUROMOD 2009    EUROMOD 2009       EUROSTAT
                                               based on HFCS   based on EU-SILC   (EU-SILC)
40% of median equivalised disposable income    5.4%            2.3%               4.1%
50% of median equivalised disposable income    9.5%            5.5%               7.9%
60% of median equivalised disposable income    15.3%           11.7%              14.6%
70% of median equivalised disposable income    22.6%           19.8%              23.8%

Source: Hufkens et al. (2014) and own calculations
Table 8: Comparison of poverty rates by age group between HFCS and EU-SILC

Age group:   EUROMOD 2009    EUROMOD 2009       EU-SILC
             based on HFCS   based on EU-SILC   incomes 2009
0-15         14.8%           14.8%              18.5%
16-29        20.8%           11.1%              14.3%
30-44        15.7%           9.4%               11.4%
45-64        13.9%           9.5%               11.6%
65+          12.2%           15.3%              19.5%

Note: poverty line is set at 60% of median equivalised income
Source: Hufkens et al. (2014) and own calculations
In short, this preliminary validation exercise of a EUROMOD database constructed on the HFCS
indicates that outcomes based on simulated disposable incomes are reasonable. This is in line with
Kuypers et al. (2015), who include a similar validation exercise for gross incomes comparing HFCS
Belgium with the EU-SILC and the SHARE database. There are, however, some remarkable differences
which warrant further investigation. The largest discrepancies are found with regard to the level of
inequality, which is found to be largely driven by divergences at the top of the distribution, which in
turn are assumed to be the consequence of the HFCS oversampling strategy. Kennickell (2008) and
Bover (2008) argue that, on top of correcting for nonresponse, oversampling of the wealthy also
provides more precise estimates of wealth in general and of narrowly held assets, as standard errors
are much smaller. Since the income and wealth distributions are highly correlated, especially at the
top (e.g. Alvaredo et al., 2013), oversampling will also result in more accurate estimates of the top of
the income distribution as well as of income sources that are typically received by a select group.
Therefore, we expect the HFCS to capture the level of inequality more accurately than EU-SILC.
Vermeulen (2014), however, shows that despite the oversampling strategy wealth shares of the top 5
and 1% are still underestimated. It is not clear whether this is also the case for the income
distribution.
Some particular aspects should be borne in mind in the use of the HFCS-EUROMOD database and the
interpretation of its outcomes. First, the HFCS sample is considerably smaller than the EU-SILC
sample. Therefore one should be careful in interpreting results for small subgroups. Second, an
analysis of some socio-demographic characteristics indicated that the sample is not fully
representative of the Belgian population. Most importantly the HFCS might slightly underestimate
the share of self-employment as main labour status. The largest limitation of the HFCS, however, is
the fact that the income reference period and the reference time of other aspects do not coincide.
Moreover, the reference period also differs between separate countries, which will complicate cross-
country analyses.
6 Conclusion
This paper explores the feasibility of considering the HFCS data as an underlying database for the
European tax-benefit model EUROMOD. We created a trial database for Belgium and validated some
aggregate results by comparing outcomes to those obtained when EU-SILC is used as underlying
database as well as to external databases. These first results indicate that it is feasible to use the
HFCS database as EUROMOD input data, although some of the outcomes need further investigation.
The main differences exist with regard to the level of inequality, which is found to be largely driven
by divergences at the top of the distribution, which in turn are assumed to be the consequence of the
HFCS oversampling strategy. As our discussion above indicated, the oversampling of wealthy
households might result in more accurate estimates of income and wealth at the top. Another
conclusion from our research is that a comparison of results between EU-SILC and the HFCS cannot
be based on medians alone. It is important to look at the whole distribution, as our outcomes show that
there are some discrepancies, especially at the bottom and the top of the distribution. The reasons
for these discrepancies should be investigated in more depth.
Hence, our preliminary conclusion is that, although transforming the HFCS into a database for
EUROMOD would require a significant amount of effort and the simulation results require detailed
scrutiny to assess their reliability against external statistics and results based on different input data,
this is surely worthwhile because of the interesting possibilities to extend the policy scope of
EUROMOD and also to consider the joint distribution of disposable income, wealth and consumption.
In a future extension of this paper a second trial database for Italy will be constructed. Since the
HFCS data for Italy originate from the conversion of an existing national survey (i.e. Survey on
Household Income and Wealth (SHIW)) the strengths and weaknesses of these data are well known.
Moreover, many more variables are available for Italy, such as imputed rent and net incomes for
instance, which will largely contribute to the validation of using the HFCS as an underlying database
for tax-benefit microsimulation in EUROMOD.
7 References
Alvaredo, F., Atkinson, A. B., Piketty, T., & Saez, E. (2013). The top 1 percent in international and
historical perspective. Journal of Economic Perspectives, 27(3), 3-20.
Bover, O. (2008). Oversampling of the wealthy in the Spanish Survey of Household Finances (EFF).
Irving Fisher Committee Bulletin, 28, pp. 399-402.
Davies, J. B. (2009). Wealth and economic inequality. In W. Salverda, B. Nolan, & T. M. Smeeding, The
Oxford Handbook of economic inequality (pp. 127-149). Oxford: Oxford University Press.
Davies, J. B., Sandström, S., Shorrocks, A., & Wolff, E. N. (2011). The level and distribution of global
household wealth. The Economic Journal, 121(551), 223-254.
Eurosystem Household Finance and Consumption Network. (2013a). The Eurosystem Household
Finance and Consumption Survey - Methodological report for the first wave. ECB Statistics
Paper No1, 112p.
Eurosystem Household Finance and Consumption Network. (2013b). The Eurosystem Household
Finance and Consumption Survey - Results from the first wave. ECB Statistics Paper No2,
112p.
Figari, F. (2013). Should we make the richest pay to meet fiscal adjustment needs? - Discussion. The
role of tax policy in times of fiscal consolidation (pp. 103-107). European Economy, Economic
Papers 502.
Figari, F., Levy, H., & Sutherland, H. (2013). Using the EU-SILC for policy simulation: Prospects, some
limitations and some suggestions. Comparative EU Statistics on Income and Living Conditions:
Issues and challenges (pp. 345-373). Eurostat Methodologies and Working Papers, European
Communities.
Figari, F., Paulus, A., & Sutherland, H. (2015). Microsimulation and policy analysis. In A. B. Atkinson, &
F. Bourguignon, Handbook of Income Distribution Volume 2B. Amsterdam: Elsevier-North
Holland.
Hills, J. (2013). Safeguarding social equity during fiscal consolidation: which tax bases to use? The role
of tax policy in times of fiscal consolidation (pp. 80-91). European Economy, Economic Papers
502.
Hufkens, T., Spiritus, K., & Vanhille, J. (2014). EUROMOD Country Report Belgium 2009-2013.
Jäntti, M., Sierminska, E., & Van Kerm, P. (2013). The joint distribution of income and wealth. In J. C.
Gornick, & M. Jäntti, Income inequality. Economic disparities and the middle class in affluent
countries (pp. 312-333). Stanford: Stanford University Press.
Kennickell, A. B. (2008). The role of oversampling of the wealthy in the Survey of Consumer Finances.
Irving Fisher Committee Bulletin, 28, pp. 403-408.
Kuypers, S., Marx, I., & Verbist, G. (2015). Joint patterns of income and wealth inequality in Belgium.
Unpublished manuscript.
Piketty, T. (2011). On the long-run evolution of inheritance: France 1820-2050. The Quarterly Journal
of Economics, 126(3), 1071-1131.
Piketty, T. (2013). Should we make the richest pay to meet fiscal adjustment needs? The role of tax
policy in times of fiscal consolidation (pp. 99-102). European Economy, Economic Papers 502.
Piketty, T. (2014). Capital in the Twenty-First Century. Harvard, USA: Harvard University Press.
Piketty, T., & Saez, E. (2013). Top incomes and the Great Recession: Recent evolutions and policy
implications. IMF Economic Review, 61, 456-478.
Piketty, T., & Zucman, G. (2014). Capital is back: Wealth-income ratios in rich countries, 1700-2010.
The Quarterly Journal of Economics, 129(3), 1255-1310.
Stiglitz, J. E., Sen, A., & Fitoussi, J.-P. (2011). Report by the Commission on the Measurement of
Economic Performance and Social Progress.
Sutherland, H., & Figari, F. (2013). EUROMOD: the European Union tax-benefit microsimulation
model. International Journal of Microsimulation, 6(1), 4-26.
Tiefensee, A., & Grabka, M. M. (2014). Comparing wealth - Data quality of the HFCS. DIW Berlin
Discussion Paper No 1427.
Vermeulen, P. (2014). How fat is the top tail of the wealth distribution? ECB Working Paper No1692.