
Research Article
Data-Driven Robust Credit Portfolio Optimization for Investment Decisions in P2P Lending

Guotai Chi, Shijie Ding, and Xiankun Peng

Faculty of Management and Economics, Dalian University of Technology, Dalian 116024, China

Correspondence should be addressed to Shijie Ding; ding0601@126.com

Received 24 October 2018; Accepted 24 December 2018; Published 2 January 2019

Academic Editor: Emilio Gómez-Déniz

Hindawi, Mathematical Problems in Engineering, Volume 2019, Article ID 1902970, 10 pages, https://doi.org/10.1155/2019/1902970

Copyright © 2019 Guotai Chi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Peer-to-Peer (P2P) lending has attracted increasing attention recently. As an emerging micro-finance platform, P2P lending plays roles in removing intermediaries, reducing transaction costs, and increasing the benefits of both borrowers and lenders. However, P2P lending investment faces two major challenges: the lack of historical observations of loans for a given borrower and the ambiguity of the estimated distribution of loans. To address these difficulties, this paper proposes a data-driven robust model of portfolio optimization with relative entropy constraints based on an "instance-based" credit risk assessment framework. The model exploits a nonparametric kernel approach to estimate P2P loans' expected return and risk under the condition that the historical data of the same borrower is unavailable. Furthermore, we construct a robust mean-variance optimization problem based on the relative entropy method for P2P loan investment decisions. Using a real-world dataset from a notable P2P lending platform, Prosper, we validate the proposed model. Empirical results reveal that our model provides better investment performance than the existing model.

1. Introduction

Peer-to-peer lending, as an emerging form of online micro-finance, provides services that bring borrowers and lenders together virtually and help them to lend to and borrow from each other directly. P2P lending platforms play roles in removing traditional financial intermediaries, reducing transaction costs, and increasing the benefits of both borrowers and lenders; therefore, they improve the efficiency of the financial market. However, due to the absence of traditional financial intermediaries, which can use collateral, certified accounts, and other means to enhance the creditworthiness of borrowers, the information asymmetry between borrowers and lenders is severe and the credit risk of P2P loan investment is very high.

Credit risk of P2P lending refers to the potential monetary loss arising from a borrower's default on a loan. Efficient and reasonable investment in P2P loans needs to be based on a reliable assessment of the credit risk distribution. Estimating the credit risk distribution of P2P loans is very challenging because the historical return (or loss) data of a loan awaiting investment is difficult to obtain. In other words, the historical yield data about the same borrower is usually unavailable. Moreover, even if the distribution of loans' returns (or losses) is approximated from the limited available data or expert knowledge, the approximation is usually not accurate; this is known as the distribution ambiguity (probability measure uncertainty) problem. In this paper, we formulate a data-driven robust portfolio optimization model based on an "instance-based" credit risk assessment method for investment decisions in P2P lending.

To help personal lenders mitigate the risk, current online P2P lending platforms have taken some risk-reducing measures, such as filtering out high-risk borrowers whose FICO score is lower than a threshold, making a preliminary rating on each loan, and providing investors with the risk level of each loan. Thus, each loan is marked with a grade, like AA, A, B, C, D, E, or NR, and loans with the same grade are considered to have the same risk level. These rating-based models are more suitable for traditional banks and lending institutions, since they have the capability to grant large amounts of loans to diversify their investments. However, individual investors possess only small amounts of funds; they need more refined risk assessment methods and investment strategies.

Similar to bond investment, P2P investors can fund a portion, not the whole, of each loan. Therefore, investors can decide which loans to invest in and, meanwhile, determine the amount of investment for each loan. This mechanism allows investors to construct a credit portfolio to mitigate risk.

Markowitz [1] proposed the famous mean-variance model, which is still widely used in portfolio selection and risk management. Since then, researchers have proposed a variety of mean-risk models, such as the mean-downside risk model [2], the mean-VaR model [3], the mean-CVaR model [4], and so on. In practice, the distribution of the assets needs to be estimated first, and then the optimal portfolio can be identified by the optimization model.

For P2P lending investment, as mentioned above, such procedures face at least two major challenges, i.e., the lack of historical observations of loans and the ambiguity of the estimated distribution of loans (the probability measure uncertainty problem). Thus, this paper proposes a data-driven robust model of portfolio optimization based on relative entropy constraints combined with an instance-based credit risk assessment method.

Specifically, we use the "instance-based" credit risk assessment method proposed by Guo et al. [5] to evaluate the return and risk of each loan without sufficient historical data of loans for each individual borrower. In this instance-based framework, the expected return of each loan is predicted as a weighted average of the returns of historical loans of other similar borrowers, where the optimal weights are learnt by kernel regression. Furthermore, using the moment information (mean and variance) of the new loans, we formulate the robust portfolio optimization model with relative entropy constraints, which obtains an optimal portfolio under the worst scenario and can reduce the potential loss caused by the uncertainty of the loans' distribution.

Our work is related to the papers by Guo et al. [5] and Yam et al. [6]. Guo et al. [5] introduce the instance-based framework into credit risk assessment of P2P loans and use the classical mean-variance model to obtain the optimal allocation. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection. Although other scholars have contributed novel insights into credit risk assessment of P2P lending and into robust optimization, to the best of our knowledge, few have considered both together. The main contribution of this paper is that we propose a data-driven robust portfolio optimization model based on relative entropy constraints combined with an instance-based risk assessment framework for P2P loan investment and obtain superior performance in numerical experiments.

The rest of this paper is organized as follows. Section 2 provides the literature review. Section 3 introduces the instance-based model for credit risk assessment as well as the mathematical framework of the kernel regression approach. In Section 4, we elaborate the robust optimization model based on the relative entropy method and formulate a robust mean-variance optimization model for P2P lending investment. The empirical results on the effectiveness of our model are reported in Section 5. Finally, Section 6 concludes this work.

2. Literature Review

Researchers have conducted many studies on assessing risk and assisting investment decision making in P2P lending. Emekter et al. [7] explore the dominant factors that explain funding success and credit risk and, meanwhile, measure the performance of P2P loans. They find that credit grade, debt-to-income ratio, FICO score, and revolving line utilization play an important role in loan defaults; furthermore, loans with lower credit grade and longer duration may result in a high mortality rate, and the higher interest rates charged on low-credit-grade borrowers are not sufficient to cover the potential loss from the higher likelihood of loan defaults. Thus, the authors suggest that investors should invest more in high-grade loans. Similarly, Berkovich's [8] study finds that high-quality loans offer excess return.

The above studies investigate the factors determining credit risk and analyze the performance of P2P loans; however, they do not propose a mechanism that assists individual investors in allocating loans effectively and making optimal investment decisions.

To help personal lenders mitigate the risk, popular online P2P platforms like Lending Club and Prosper have developed credit scoring systems to assess the creditworthiness of each borrower based on data mining or machine learning techniques. There is a large body of existing literature concerned with credit rating using data mining techniques, for example, linear discriminant analysis (LDA) [9], k-nearest neighbors [10], logistic regression [11], classification and regression trees (CART) [12], Markov chains [13], survival analysis [14], artificial neural networks (ANN) [15], genetic methods [16], support vector machines (SVM) [17, 18], lasso-probit [19], and so on.

In the portfolio selection problem, full knowledge of the assets' distribution is usually assumed in order to determine the optimal portfolio. In most real-life applications, we need to approximate the assets' distribution. However, the approximations are not necessarily accurate; this is known as the distribution ambiguity (probability measure uncertainty) problem.

Robust optimization is an attractive way to solve the portfolio selection problem under distribution ambiguity. As the exact parameters are unavailable, Natarajan et al. [20] use a set of parameters (which represent different distributions or scenarios) rather than a point estimate of the parameters to formulate the asset allocation problem. Following this idea, there are different ways to model ambiguity by using a set of parameters. Chen et al. [21] take the lower partial moments and CVaR as two risk measures and consider tight bounds that are likely to cover the possible parameters. Epstein [22] considers intervals that may include the actual parameters. Natarajan et al. [23] use a piecewise-linear concave utility function to derive accurate and estimated optimal strategies for the expected utility model in the portfolio optimization problem under worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity and identify the optimal portfolio.

Since relative entropy can measure the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.

In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors only possess information about the historical data of asset returns. Bertsimas et al. [27] use the KS test, the $\chi^2$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the data points available, whereas the structure of the uncertainty set in our study is not predefined; we consider the uncertainty of mean, covariance, and distribution jointly. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with data of stock indices, while our work is based on the mean-variance framework with data of P2P loans.

3. Instance-Based Model for Credit Risk Assessment

Using historical data to evaluate future performance and potential loss is a convention. However, unlike bond or stock investment, the historical yield data about the same P2P borrower is usually unavailable. Thus, the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].

3.1. Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan $i$, using $n$ past loans, each with a historical return $R_j$ ($j = 1, 2, \ldots, n$), we can calculate the expected return $\mu_i$ of loan $i$ based on a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j, \tag{1}$$

where $w_{ij}$ denotes the weight of loan $j$ for predicting the expected return of loan $i$. The weight depends on the similarity between loan $i$ and loan $j$. Intuitively, the greater the similarity, the greater the weight. The calculation of the weight is introduced in Section 3.2.

The weighted returns of the past loans are treated as historical observations of the new loan. Following this line of thought and taking variance as the risk measure, the weighted variance of the past loans is used to assess the new loan's risk, that is,

$$\sigma_i^2 = \sum_{j=1}^{n} w_{ij} \left(R_j - \mu_i\right)^2, \tag{2}$$

where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and, therefore, the larger the weight. In particular, the absolute deviation of default probabilities between loans $i$ and $j$ is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans $i$ and $j$, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process is introduced in the next subsection.
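As a small numerical illustration of (1) and (2), the following Python sketch computes the weighted mean and weighted variance for one new loan. The function name, the example returns, and the weights are hypothetical, and the weights are assumed to be already normalized so that they sum to one.

```python
def weighted_return_and_risk(weights, returns):
    """Return (mu_i, sigma_i^2) following Eqs. (1) and (2).

    Assumption: the weights are nonnegative and sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Eq. (1): weighted average of the past loans' returns
    mu = sum(w * r for w, r in zip(weights, returns))
    # Eq. (2): weighted variance around that average
    var = sum(w * (r - mu) ** 2 for w, r in zip(weights, returns))
    return mu, var


# Hypothetical past loans: two repaid with interest, one defaulted.
past_returns = [0.10, 0.08, -0.50]
weights = [0.5, 0.3, 0.2]
mu_i, var_i = weighted_return_and_risk(weights, past_returns)
print(mu_i, var_i)
```

Even a modest weight on a defaulted loan pulls the expected return down and inflates the variance, which is why the choice of weights matters.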

3.2. Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method for investigating the nonlinear relation between random variables, and it is based on kernel density estimation. First of all, the preliminaries of kernel estimation are introduced.

Given $n$ realizations $z_j$, $j = 1, \ldots, n$, of a random variable $z$, the kernel estimate $\hat{p}(z)$ of the probability density function $p(z)$ is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\!\left(\frac{z_j - z}{h}\right), \tag{3}$$

where $K(\cdot)$ is a kernel function and $h$ is a smoothing parameter.

The kernel function $K(\cdot)$ is nonnegative and bounded and, meanwhile, satisfies the following properties:

(a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$; (b) $\int_{-\infty}^{\infty} z K(z)\,dz = 0$; (c) $\int_{-\infty}^{\infty} z^2 K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as uniform, triangular, biweight, triweight, and Gaussian [29]. Because the kernel estimation is insensitive to the choice of kernel function, we use the Gaussian kernel function due to its convenient mathematical properties, which is written as $K(z) = (1/\sqrt{2\pi})\,e^{-z^2/2}$.

The smoothing parameter $h = h(n)$ is also called the bandwidth, and it depends on the sample size $n$. Specifically, $h(n) \to 0$ and $n \cdot h(n) \to \infty$ as $n \to \infty$.

Many studies show that the choice of kernel function does not affect the estimation significantly; however, the choice of the bandwidth is a vital issue [30, 31]. The determination of the bandwidth is shown in detail in Section 5.3.
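The kernel density estimate in (3) with the Gaussian kernel can be sketched in a few lines of Python. This is only an illustration: the function names, the sample values, and the bandwidth 0.1 are hypothetical and are not taken from the paper's experiments.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kde(z, samples, h):
    """Kernel density estimate of Eq. (3) at point z with bandwidth h > 0."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(samples)
    return sum(gaussian_kernel((zj - z) / h) for zj in samples) / (n * h)


# Hypothetical realizations of the random variable z.
samples = [0.1, 0.2, 0.25, 0.4]
density = kde(0.2, samples, h=0.1)
print(density)
```

A smaller bandwidth makes the estimate follow individual samples more closely, while a larger one smooths them out, which is why the bandwidth rather than the kernel shape drives the result.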

In the following, we introduce the kernel regression model proposed by Nadaraya [32]. Theoretically, we assume that each observation is denoted as $(X, Y)$, which is an $\mathbb{R}^2$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\left((x - x_j)/h\right)}{\sum_{k=1}^{n} K\left((x - x_k)/h\right)} \cdot y_j \right], \tag{4}$$

where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the $j$th loan, respectively. Thereby, the estimate of the $i$th loan's return can be written as

$$\mu_i = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)} \cdot R_j \right]. \tag{5}$$

Note that the determination of the loans' default probabilities is introduced in Section 5.1.

Comparing (1) to (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)}. \tag{6}$$

Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

$$\hat{\sigma}_i^2 = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)} \cdot \left(R_j - \mu_i\right)^2 \right]. \tag{7}$$
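Equations (5)-(7) can be sketched as follows in Python. The function names, the default probabilities, the returns, and the bandwidth are hypothetical illustrations, not the paper's data; the Gaussian kernel is assumed, and its constant factor is omitted because it cancels in the ratio of (6).

```python
import math


def kernel_weights(p_new, past_probs, h):
    """Weights w_ij of Eq. (6) for a new loan with default probability p_new."""
    # The factor 1 / sqrt(2 * pi) of the Gaussian kernel cancels in the ratio.
    k = [math.exp(-0.5 * ((p_new - pj) / h) ** 2) for pj in past_probs]
    total = sum(k)
    if total == 0.0:
        raise ValueError("all kernel values underflowed; use a larger h")
    return [v / total for v in k]


def assess_loan(p_new, past_probs, past_returns, h):
    """Expected return (Eq. (5)) and variance (Eq. (7)) of a new loan."""
    w = kernel_weights(p_new, past_probs, h)
    mu = sum(wj * rj for wj, rj in zip(w, past_returns))
    var = sum(wj * (rj - mu) ** 2 for wj, rj in zip(w, past_returns))
    return mu, var


# Hypothetical past loans: (default probability, realized return).
past_probs = [0.05, 0.10, 0.30]
past_returns = [0.08, 0.06, -0.40]
mu_new, var_new = assess_loan(0.07, past_probs, past_returns, h=0.05)
print(mu_new, var_new)
```

In this example the past loan with default probability 0.30 is far from the new loan (0.07) relative to the bandwidth, so its large loss receives almost no weight.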

4. Robust Investment Decision Model

Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans. We start from the classical mean-variance optimization model proposed by Markowitz [1] and proceed to its tractable robust counterpart.

4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to the investor's risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. Then the return and risk of the portfolio become $\lambda^{T}\mu$ and $\lambda^{T}V\lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$, respectively. Here $P$ represents the ideal estimated market condition, where $\mu$ and $V$, estimated by using all available information including historical observations, news, expert knowledge, and so on, are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$
\text{(MV)} \quad
\begin{aligned}
\min_{\lambda} \quad & \lambda^{T} V \lambda \\
\text{s.t.} \quad & \lambda^{T} \mu \ge R^{*}, \\
& \lambda \in \Omega,
\end{aligned}
\tag{8}
$$

where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^{*}$ is the required return rate specified by the investor.

In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by using the estimated input parameters $\mu$ and $V$ directly may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that remain feasible no matter what the actual values of the parameters are.

Investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$
\text{(RMV)} \quad
\begin{aligned}
\min_{\lambda} \ \sup_{Q \in \mathcal{Q}} \quad & \lambda^{T} V_Q \lambda \\
\text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^{T} \mu_Q \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{9}
$$

It is rational to assume that the actual value of the parameters is in the neighborhood of the estimator. Thus, we can generate the uncertainty set $\mathcal{Q}$ based on the assumption that the measures in the set should not be far from the ideal measure $P$. Relative entropy, also known as the Kullback-Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of the measure $Q$ in $\mathcal{Q}$ with respect to the measure $P$ is

$$D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)}\,dx, \tag{10}$$

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

$$D_{KL}(Q \,\|\, P) = \frac{1}{2} \left[ \ln |V| - \ln |V_Q| + \operatorname{tr}\left(V^{-1} V_Q\right) - n + (\mu - \mu_Q)^{T} V^{-1} (\mu - \mu_Q) \right], \tag{11}$$

where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^{T}$ are the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
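For intuition about (11), the following Python sketch evaluates the relative entropy in the special case where both covariance matrices are diagonal, so that the determinant, trace, and inverse reduce to elementwise operations. The diagonal restriction and the function name are assumptions made here for brevity; (11) itself holds for general positive definite matrices.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """Eq. (11) for diagonal V = diag(var_p) and V_Q = diag(var_q).

    mu_p, var_p: mean and variances under the reference measure P.
    mu_q, var_q: mean and variances under the alternative measure Q.
    """
    n = len(mu_p)
    # ln|V| - ln|V_Q| for diagonal matrices
    log_det_term = sum(math.log(vp) - math.log(vq) for vp, vq in zip(var_p, var_q))
    # tr(V^{-1} V_Q)
    trace_term = sum(vq / vp for vp, vq in zip(var_p, var_q))
    # (mu - mu_Q)^T V^{-1} (mu - mu_Q)
    quad_term = sum((mp - mq) ** 2 / vp for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_term + trace_term - n + quad_term)


# Identical measures have zero relative entropy.
print(kl_gaussian_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
# Shifting one mean by one standard deviation gives 0.5.
print(kl_gaussian_diag([1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))
```

The divergence grows as the alternative mean or variances move away from the reference ones, which is what lets a bound on it define a neighborhood of $P$.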

Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measure $Q$ in $\mathcal{Q}$. Using the constraint of relative entropy, we can rewrite the robust optimization model (9) as

$$
\text{(RMV-RE)} \quad
\begin{aligned}
\min_{\lambda} \ \max_{(\mu_Q, V_Q) \in U} \quad & \lambda^{T} V_Q \lambda \\
\text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^{T} \mu_Q \ge R^{*}, \\
& D_{KL}(Q \,\|\, P) \le K, \\
& \lambda \in \Omega,
\end{aligned}
\tag{12}
$$

where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investor's confidence in $\mu$ and $V$ estimated under probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.

Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is a tractable formulation and can be solved efficiently. That is,

$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} V^{*} \lambda \\
\text{s.t.} \quad & \lambda^{T} \mu^{*} \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{13}
$$

Herein, $\mu^{*} = \zeta\mu$, $V^{*} = V + \zeta(1 - \zeta)\mu\mu^{T}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and as $\zeta$ decreases, the investor's confidence weakens. For the details of the proof, see Yam et al. [6].
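The adjusted inputs of (13) are simple to compute once $\zeta$ is chosen. The Python sketch below builds $\mu^{*}$ and $V^{*}$ from given $\mu$, $V$, and $\zeta$; the function name and the numbers are hypothetical, and how $\zeta$ is obtained from $K$ is not shown here (see Yam et al. [6]).

```python
def robust_parameters(mu, V, zeta):
    """Return (mu_star, V_star) with mu* = zeta * mu and
    V* = V + zeta * (1 - zeta) * mu * mu^T, as used in (13)."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    mu_star = [zeta * m for m in mu]
    c = zeta * (1.0 - zeta)
    V_star = [[V[i][j] + c * mu[i] * mu[j] for j in range(n)] for i in range(n)]
    return mu_star, V_star


# Hypothetical two-loan example.
mu = [0.08, 0.05]
V = [[0.04, 0.0], [0.0, 0.01]]
mu_star, V_star = robust_parameters(mu, V, zeta=0.9)
print(mu_star)
print(V_star)
```

With $\zeta = 1$ the inputs are unchanged; for $\zeta < 1$ the expected returns shrink and the rank-one term $\zeta(1 - \zeta)\mu\mu^{T}$ is added to the covariance, so the robust problem is more conservative than (8).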

4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\hat{\sigma}_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^{T}$ and

$$
\hat{V} =
\begin{bmatrix}
\hat{\sigma}_1^2 & 0 & \cdots & 0 \\
0 & \hat{\sigma}_2^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \hat{\sigma}_n^2
\end{bmatrix}
\tag{14}
$$

denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} \left( \hat{V} + \zeta(1 - \zeta)\mu\mu^{T} \right) \lambda \\
\text{s.t.} \quad & \lambda^{T} (\zeta\mu) \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{15}
$$

Table 1: Description of variables.

| Variable | Description |
|----------|-------------|
| X1 | FICO score of the borrower |
| X2 | The number of inquiries of the borrower in the last 6 months |
| X3 | The monetary amount of the loan |
| X4 | The homeownership status of the borrower (0 = rent, 1 = own) |
| X5 | The debt-to-income ratio of the borrower |
| X6 | The number of accounts delinquent |
| X7 | The number of public records in the past 10 years |
| Y | Dependent variable (0 = completed, 1 = default) |

The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.

(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.

(3) For each loan, the amount that the lender can invest is no more than the amount $m_i$ requested by the borrower; thereby $\lambda_i M \le m_i$, where $M$ is the total amount the investor has available for investment.
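To make the objective of (15) and the three constraints concrete, the Python sketch below evaluates a candidate portfolio rather than solving the optimization. It assumes the diagonal covariance of (14), so that $\lambda^{T}\hat{V}\lambda = \sum_i \lambda_i^2 \hat{\sigma}_i^2$; the function name, tolerance, and all numbers are hypothetical.

```python
def evaluate_portfolio(lam, mu, var, zeta, r_star, requested, budget, tol=1e-9):
    """Evaluate a candidate portfolio against problem (15).

    lam: portfolio weights; mu, var: per-loan expected returns and variances;
    requested: amounts m_i requested by borrowers; budget: total amount M.
    Returns (robust variance, robust return, feasibility flag).
    """
    exp_ret = sum(l * m for l, m in zip(lam, mu))
    # lambda^T (V-hat + zeta (1 - zeta) mu mu^T) lambda with diagonal V-hat
    robust_var = (sum(l * l * v for l, v in zip(lam, var))
                  + zeta * (1.0 - zeta) * exp_ret ** 2)
    robust_ret = zeta * exp_ret
    feasible = (
        abs(sum(lam) - 1.0) <= tol                        # constraint (1)
        and all(l >= -tol for l in lam)                   # constraint (2)
        and all(l * budget <= m_i + tol
                for l, m_i in zip(lam, requested))        # constraint (3)
        and robust_ret >= r_star - tol                    # return constraint
    )
    return robust_var, robust_ret, feasible


# Hypothetical example with two loans and a budget of 1000.
result = evaluate_portfolio(
    lam=[0.5, 0.5], mu=[0.08, 0.06], var=[0.04, 0.02], zeta=0.9,
    r_star=0.05, requested=[600.0, 800.0], budget=1000.0,
)
print(result)
```

An actual solution of (15) would pass this objective and these constraints to a quadratic programming solver; the check above is only a way to inspect a given allocation.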

5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using a real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocess. The dataset for the empirical study is from a notable P2P lending platform in the United States, Prosper. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1-7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.


Figure 1: The curve of CV(h) (horizontal axis: bandwidth h, from 0 to 0.2; vertical axis: CV(h), from 0.095 to 0.1).

There exist many credit scoring models to predict the default probability of a loan, such as the XGBoost model [33-35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.

We randomly divide the dataset into two parts: the first, containing 40% of all loans, is used for determining the optimal bandwidth h in (5), which is described in detail in Section 5.3, and the second contains the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans awaiting investment whose pay-back statuses are therefore treated as unknown, and all other subsets are taken as a training set, which consists of historical loans with known yield.
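The splitting scheme described above can be sketched in Python as follows. The function names, the fixed seed, and the use of integer loan identifiers are assumptions for illustration; the paper's experiments were run in MATLAB and its exact randomization is not specified.

```python
import random


def split_loans(loan_ids, bandwidth_share=0.4, n_folds=20, seed=0):
    """Split loans into a bandwidth-selection part and n_folds subsets."""
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    ids = list(loan_ids)
    rng.shuffle(ids)
    cut = round(bandwidth_share * len(ids))
    bandwidth_part, rest = ids[:cut], ids[cut:]
    # Deal the remaining loans into n_folds nearly equal subsets.
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def train_test_rounds(folds):
    """Yield (training set, testing set) for each cross-validation round."""
    for k, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, test


# With 17001 loans: 6800 for bandwidth selection, 20 folds of 510 or 511 loans.
bandwidth_part, folds = split_loans(range(17001))
print(len(bandwidth_part), [len(f) for f in folds][:3])
```

Each round then treats one fold as loans awaiting investment and the other 19 folds as historical loans with known returns.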

5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the two models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. As the results of Guo et al. [5] show, this model outperforms some rating-based models.

RIOM is the robust instance-based model proposed in this study. The expected return and risk of each loan are also assessed with the "instance-based" assessment framework. However, we use the robust credit portfolio optimization model based on the relative entropy method, Equation (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i^2$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.

(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment on the optimal portfolio.

(3) Compare the return rates of the two models.

5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\,e^{-\zeta^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth h, is optimized by the following leave-one-out cross-validation:

$$h_{optimal} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_h(p_{-i}) - \mu_i\right)^2, \qquad (16)$$

where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1,\, j \neq i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1,\, k \neq i}^{n} K\left((p_i - p_k)/h\right)} \cdot R_j \right]. \qquad (17)$$

The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimum, and the value of h at this minimum is the optimal bandwidth for the model.
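The leave-one-out search in (16) and (17) can be sketched in plain Python as below. This is an illustrative sketch on toy data: the grid search, the function names, and the choice to compare each leave-one-out estimate with the observed return $R_i$ are assumptions, since the paper does not spell out these implementation details.

```python
import math

def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def loo_estimate(p, R, i, h):
    """Leave-one-out kernel estimate for loan i, as in (17); None if all weights vanish."""
    num = den = 0.0
    for j in range(len(p)):
        if j == i:
            continue
        k = gaussian_kernel((p[i] - p[j]) / h)
        num += k * R[j]
        den += k
    return num / den if den > 0.0 else None

def cv_score(p, R, h):
    """CV(h) as in (16), using the observed return R[i] as the target (assumption)."""
    total = 0.0
    for i in range(len(p)):
        est = loo_estimate(p, R, i, h)
        if est is None:
            return math.inf
        total += (est - R[i]) ** 2
    return total

def best_bandwidth(p, R, grid):
    """Grid search for the bandwidth with the smallest CV score."""
    return min(grid, key=lambda h: cv_score(p, R, h))

# Toy data: returns fall as the default probability rises, plus a deterministic wiggle.
p = [0.02 * k for k in range(1, 31)]
R = [0.10 - 0.30 * pk + (0.02 if k % 2 else -0.02) for k, pk in enumerate(p)]
grid = [0.02 * k for k in range(1, 11)]   # 0.02, 0.04, ..., 0.20, the range of Figure 1
h_opt = best_bandwidth(p, R, grid)
```

A grid search is used here only because it mirrors reading the minimum off a curve such as Figure 1; any one-dimensional minimizer could be substituted.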

To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount M = 15 thousand dollars, and the required rate of return $R^* = 0.05$. We also set the risk-free return rate to 0.025, which is approximately the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.


Figure 2: Performance comparison (horizontal axis: the number of the parameter set, 1 to 10; vertical axis: return rate of investment, 0 to 0.09; series: IOM and RIOM).

Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset   IOM      RIOM
1        0.0501   0.0566
2        0.0550   0.0633
3        0.0540   0.0618
4        0.0564   0.0696
5        0.0627   0.0714
6        0.0543   0.0629
7        0.0532   0.0688
8        0.0605   0.0711
9        0.0593   0.0706
10       0.0546   0.0664
11       0.0637   0.0701
12       0.0567   0.0640
13       0.0468   0.0569
14       0.0519   0.0663
15       0.0544   0.0620
16       0.0357   0.0472
17       0.0588   0.0710
18       0.0607   0.0774
19       0.0544   0.0655
20       0.0625   0.0808
Average  0.0553   0.0662
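The summary statistics of Table 2 can be rechecked with a few lines of Python. The sketch below uses the return rates transcribed from Table 2 and the required rate $R^* = 0.05$ stated in the text; the variable names are illustrative.

```python
# Return rates transcribed from Table 2 (subsets 1 to 20).
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]
r_star = 0.05  # required rate of return used for Table 2

mean_iom = sum(iom) / len(iom)
mean_riom = sum(riom) / len(riom)
# Number of subsets in which RIOM earns a strictly higher return than IOM.
riom_wins = sum(r > i for i, r in zip(iom, riom))
# Subsets in which both portfolios fall short of the required rate.
both_below = [k + 1 for k, (i, r) in enumerate(zip(iom, riom)) if i < r_star and r < r_star]

print(round(mean_iom, 4), round(mean_riom, 4), riom_wins, both_below)
```

On the transcribed values this reproduces the reported averages, shows RIOM ahead in all 20 subsets, and singles out subset 16 as the only one in which both portfolios miss the required rate.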

In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^*$ and investment amount M.

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The results are shown more intuitively in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can find that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10000                5.0%
2     $10000                5.5%
3     $10000                6.0%
4     $15000                5.0%
5     $15000                5.5%
6     $15000                6.0%
7     $20000                5.0%
8     $20000                5.5%
9     $20000                6.0%

In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, this model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.


Table 4: Investment performances of input parameters for portfolio selection.

Subset  R*=5%           R*=5.5%         R*=6%           R*=5%           R*=5.5%         R*=6%           R*=5%           R*=5.5%         R*=6%
        M=10000         M=10000         M=10000         M=15000         M=15000         M=15000         M=20000         M=20000         M=20000
        IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM
1       0.0598  0.0727  0.0601  0.0762  0.0502  0.0722  0.0501  0.0566  0.0558  0.0839  0.0520  0.0833  0.0594  0.0774  0.0544  0.0689  0.0691  0.0649
2       0.0500  0.0592  0.0601  0.0779  0.0675  0.0851  0.0550  0.0633  0.0517  0.0620  0.0551  0.0648  0.0504  0.0584  0.0664  0.0895  0.0661  0.0769
3       0.0441  0.0571  0.0491  0.0698  0.0735  0.0934  0.0540  0.0618  0.0598  0.0778  0.0631  0.0646  0.0503  0.0568  0.0554  0.0737  0.0647  0.0869
4       0.0525  0.0602  0.0658  0.0903  0.0636  0.0754  0.0564  0.0696  0.0512  0.0629  0.0553  0.0847  0.0566  0.0648  0.0617  0.0856  0.0518  0.0635
5       0.0532  0.0620  0.0631  0.0829  0.0513  0.0731  0.0627  0.0714  0.0566  0.0721  0.0616  0.0896  0.0576  0.0709  0.0547  0.0692  0.0610  0.0771
6       0.0634  0.0747  0.0564  0.0762  0.0717  0.1105  0.0543  0.0629  0.0570  0.0772  0.0585  0.0874  0.0584  0.0683  0.0528  0.0701  0.0516  0.0385
7       0.0613  0.0736  0.0547  0.0754  0.0551  0.0884  0.0532  0.0688  0.0528  0.0727  0.0620  0.0566  0.0481  0.0549  0.0485  0.0631  0.0460  0.0759
8       0.0529  0.0594  0.0505  0.0616  0.0685  0.0858  0.0605  0.0711  0.0545  0.0768  0.0645  0.0855  0.0545  0.0701  0.0628  0.0823  0.0592  0.0845
9       0.0548  0.0647  0.0550  0.0736  0.0559  0.0493  0.0593  0.0706  0.0507  0.0576  0.0574  0.1214  0.0535  0.0586  0.0561  0.0764  0.0574  0.1038
10      0.0474  0.0574  0.0472  0.0634  0.0499  0.0794  0.0546  0.0664  0.0528  0.0634  0.0622  0.0631  0.0514  0.0597  0.0582  0.0683  0.0689  0.0532
11      0.0597  0.0730  0.0602  0.0795  0.0661  0.1090  0.0637  0.0701  0.0562  0.0756  0.0498  0.0662  0.0531  0.0584  0.0569  0.0657  0.0572  0.1141
12      0.0644  0.0768  0.0541  0.0673  0.0624  0.1042  0.0567  0.0640  0.0529  0.0677  0.0574  0.0983  0.0551  0.0678  0.0536  0.0734  0.0618  0.0687
13      0.0635  0.0785  0.0709  0.0895  0.0532  0.0662  0.0468  0.0569  0.0637  0.0880  0.0504  0.0913  0.0555  0.0695  0.0636  0.0824  0.0616  0.1157
14      0.0593  0.0744  0.0626  0.0751  0.0634  0.1204  0.0519  0.0663  0.0568  0.0716  0.0614  0.1162  0.0577  0.0674  0.0541  0.0636  0.0572  0.0818
15      0.0523  0.0636  0.0485  0.0609  0.0571  0.0987  0.0544  0.0620  0.0577  0.0764  0.0633  0.0802  0.0597  0.0775  0.0536  0.0706  0.0595  0.0704
16      0.0549  0.0705  0.0684  0.0893  0.0508  0.1264  0.0357  0.0472  0.0642  0.0851  0.0573  0.0549  0.0593  0.0698  0.0616  0.0807  0.0551  0.0748
17      0.0549  0.0666  0.0549  0.0757  0.0538  0.0677  0.0588  0.0710  0.0674  0.0867  0.0615  0.0496  0.0535  0.0641  0.0487  0.0636  0.0696  0.0915
18      0.0546  0.0629  0.0512  0.0615  0.0560  0.0610  0.0607  0.0774  0.0585  0.0732  0.0687  0.0824  0.0599  0.0729  0.0576  0.0746  0.0507  0.1069
19      0.0492  0.0555  0.0572  0.0685  0.0657  0.0436  0.0544  0.0655  0.0434  0.0633  0.0589  0.0759  0.0581  0.0673  0.0472  0.0638  0.0623  0.1148
20      0.0554  0.0645  0.0413  0.0504  0.0596  0.0366  0.0625  0.0808  0.0562  0.0687  0.0698  0.0978  0.0518  0.0709  0.0601  0.0734  0.0638  0.0744
Average 0.0554  0.0664  0.0566  0.0732  0.0598  0.0823  0.0553  0.0662  0.0560  0.0731  0.0595  0.0807  0.0552  0.0663  0.0564  0.0729  0.0597  0.0819


Data Availability

The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P Lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.


[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.



individual investors possess only small amounts of funds, they need more refined risk assessment methods and investment strategies.

Similar to bond investment, P2P investors can fund a portion, not the whole, of each loan. Therefore, investors can decide which loans to invest in and, meanwhile, determine the amount of investment for each loan. This mechanism allows investors to construct a credit portfolio to mitigate risk.

Markowitz [1] proposes the famous mean-variance model, which is still widely used in portfolio selection and risk management. Since then, researchers have proposed a variety of mean-risk models, such as the mean-downside risk model [2], the mean-VaR model [3], the mean-CVaR model [4], and so on. In practice, the distribution of the assets needs to be estimated first, and then the optimal portfolio can be identified by the optimization model.

For P2P lending investment, as mentioned above, such procedures face at least two major challenges, i.e., the deficiency of loans' historical observations and the ambiguity of the estimated loan distribution (the probability measure uncertainty problem). Thus, this paper proposes a data-driven robust model of portfolio optimization based on relative entropy constraints, combined with an instance-based credit risk assessment method.

Specifically, we use the "instance-based" credit risk assessment method proposed by Guo et al. [5] to evaluate the return and risk of each loan without sufficient historical loan data for each individual borrower. In this instance-based framework, the expected return of each loan is predicted as a weighted average of the historical loans of other similar borrowers, where the optimal weights are learnt by kernel regression. Furthermore, using the moment information (mean and variance) of the new loans, we formulate the robust portfolio optimization model with relative entropy constraints, which yields an optimal portfolio under the worst-case scenario and can reduce the potential loss caused by the uncertainty of the loans' distribution.

Our work is related to the papers by Guo et al. [5] and Yam et al. [6]. Guo et al. [5] introduce the instance-based framework into the credit risk assessment of P2P loans and use the classical mean-variance model to obtain the optimal allocation. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection. Although other scholars have contributed novel insights into credit risk assessment of P2P lending and into robust optimization, to the best of our knowledge, few have considered both together. The main contribution of this paper is that we propose a data-driven robust portfolio optimization model based on relative entropy constraints, combined with an instance-based risk assessment framework, for P2P loan investment and obtain superior performance in numerical experiments.

The rest of this paper is organized as follows. Section 2 provides the literature review. Section 3 introduces the instance-based model for credit risk assessment as well as the mathematical framework of the kernel regression approach. In Section 4, we elaborate the robust optimization model based on the relative entropy method and formulate a robust mean-variance optimization model for P2P lending investment. The empirical results on the effectiveness of our model are reported in Section 5. Finally, Section 6 concludes this work.

2. Literature Review

In order to assess risk and assist investment decision making in P2P lending, researchers have conducted many studies. Emekter et al. [7] explore the dominant factors that explain funding success and credit risk and, meanwhile, measure the performance of P2P loans. They find that credit grade, debt-to-income ratio, FICO score, and revolving line utilization play an important role in loan defaults; furthermore, loans with lower credit grade and longer duration may result in a high mortality rate, and the higher interest rates charged to low-credit-grade borrowers are not sufficient to cover the potential loss from the higher likelihood of loan defaults. Thus, the authors suggest that investors should invest more in high-grade loans. Similarly, Berkovich's [8] study finds that high-quality loans offer excess return.

The above studies investigate the factors determining credit risk and analyze the performance of P2P loans; however, they do not propose a mechanism that assists individual investors in allocating funds across loans effectively and making optimal investment decisions.

To help personal lenders mitigate risk, popular online P2P platforms like Lending Club and Prosper have developed credit scoring systems to assess the creditworthiness of each borrower based on data mining or machine learning techniques. There is a large body of existing literature concerned with credit rating using data mining techniques, for example, linear discriminant analysis (LDA) [9], k-nearest neighbors [10], logistic regression [11], classification and regression trees (CART) [12], Markov chains [13], survival analysis [14], artificial neural networks (ANN) [15], genetic methods [16], support vector machines (SVM) [17, 18], lasso-probit [19], and so on.

In the portfolio selection problem, full knowledge of the assets' distribution is usually assumed in order to determine the optimal portfolio. In most real-life applications, we need to approximate the assets' distribution. However, the approximations are not necessarily accurate, which is known as the distribution ambiguity (probability measure uncertainty) problem.

Robust optimization is an attractive way to solve the portfolio selection problem under distribution ambiguity. As the exact parameters are unavailable, Natarajan et al. [20] use a set of parameters (which represent different distributions or scenarios), rather than a point estimate of the parameters, to formulate the asset allocation problem. Following this idea, there are different ways to model ambiguity by using a set of parameters. Chen et al. [21] take the lower partial moments and CVaR as two risk measures and consider tight bounds that are likely to cover the possible parameters. Epstein [22] considers intervals that may include the actual parameters. Natarajan et al. [23] use a piecewise-linear concave utility function to derive accurate and estimated optimal strategies for the expected utility model in portfolio optimization under worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity and identify the optimal portfolio.

Since relative entropy can measure the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.
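For readers unfamiliar with the measure, the sketch below computes the relative entropy (Kullback-Leibler divergence) between two discrete distributions. The two-outcome distributions and the function name are hypothetical and serve only to show how a candidate distribution's distance from a nominal one could be checked against an uncertainty-set radius.

```python
import math

def relative_entropy(q, p):
    """D(Q || P) = sum_k q_k * log(q_k / p_k) for discrete distributions on one support."""
    if len(q) != len(p):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for qk, pk in zip(q, p):
        if qk == 0.0:
            continue            # the term 0 * log(0 / p) is taken as 0
        if pk == 0.0:
            return math.inf     # Q puts mass where P has none
        total += qk * math.log(qk / pk)
    return total

nominal = [0.90, 0.10]    # hypothetical estimate: repay / default
stressed = [0.85, 0.15]   # hypothetical alternative scenario
d = relative_entropy(stressed, nominal)
print(f"D(stressed || nominal) = {d:.4f}")
```

The divergence is zero only when the two distributions coincide, and it is not symmetric in its arguments, which is why the order of the nominal and alternative measures matters when an uncertainty set is defined.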

In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors possess only information about the historical data of asset returns. Bertsimas et al. [27] use the KS test, the $\chi^2$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the data points available. In contrast, the structure of the uncertainty set in our study is not predefined; we consider the uncertainty of the mean, covariance, and distribution jointly. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under the condition of distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with data on stock indices, while our work is based on the mean-variance framework with data on P2P loans.

3. Instance-Based Model for Credit Risk Assessment

Using historical data to evaluate future performance and potential loss is conventional. However, unlike investment in bonds or stocks, the historical yield data of the same P2P borrower is usually unavailable. Thus, the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].

3.1. Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan i, using n past loans, each with a historical return $R_j$ (j = 1, 2, ..., n), we can calculate the expected return of loan i, $\mu_i$, as a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j, \qquad (1)$$

where $w_{ij}$ denotes the weight of loan j for predicting the expected return of loan i. The weight depends on the similarity between loan i and loan j. Intuitively, the greater the similarity, the greater the weight. The calculation of the weight will be introduced in Section 3.2.

The weighted returns of the past loans are treated as historical observations of a new loan. Following this line of thought and taking variance as the risk measure, the weighted variance of past loans is used to assess the new loan's risk, that is,

$$\sigma_i^2 = \sum_{j=1}^{n} w_{ij} \left(R_j - \mu_i\right)^2, \qquad (2)$$

where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and therefore the larger the weight. In particular, the absolute deviation of default probabilities between loans i and j is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans i and j, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process will be introduced in the next subsection.
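To make (1) and (2) concrete, consider a hypothetical new loan i with three similar past loans. The weights and returns below are made up purely for illustration and do not come from the Prosper data.

```latex
% Hypothetical inputs: w_{i1}=0.5, w_{i2}=0.3, w_{i3}=0.2 and R_1=0.10, R_2=0.06, R_3=-0.20
\begin{aligned}
\mu_i      &= 0.5(0.10) + 0.3(0.06) + 0.2(-0.20) = 0.05 + 0.018 - 0.04 = 0.028, \\
\sigma_i^2 &= 0.5(0.10 - 0.028)^2 + 0.3(0.06 - 0.028)^2 + 0.2(-0.20 - 0.028)^2 \\
           &= 0.002592 + 0.0003072 + 0.0103968 = 0.013296 .
\end{aligned}
```

Even though the defaulted loan carries the smallest weight, it contributes most of the variance, which shows how a single similar default raises the assessed risk of the new loan.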

3.2. Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method for investigating the nonlinear relation between random variables, and it is based on kernel density estimation. First of all, the preliminaries of kernel estimation are introduced.

Given n realizations $z_j$, j = 1, ..., n, of a random variable z, the kernel estimate $\hat{p}(z)$ of the probability density function p(z) is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{z_j - z}{h}\right), \qquad (3)$$

where K(·) is a kernel function and h is a smoothing parameter.
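A direct transcription of (3) into Python, with a Gaussian kernel, might look as follows. The sample values and the bandwidth are hypothetical; the sketch only illustrates how the estimate is assembled from kernel bumps centred at the observations.

```python
import math

def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(z, sample, h):
    """Kernel density estimate (3): (1 / (n h)) * sum_j K((z_j - z) / h)."""
    n = len(sample)
    return sum(gaussian_kernel((zj - z) / h) for zj in sample) / (n * h)

sample = [0.03, 0.05, 0.06, 0.10, 0.12, 0.20]   # hypothetical default probabilities
h = 0.05                                        # hypothetical bandwidth
density = kde(0.06, sample, h)
print(f"estimated density at z = 0.06: {density:.3f}")
```

A larger h smooths the estimate by spreading each observation's contribution more widely, while a smaller h makes the estimate follow individual observations more closely.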

The kernel function K(·) is nonnegative and bounded and satisfies the following properties:

(a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$; (b) $\int_{-\infty}^{\infty} z K(z)\,dz = 0$; (c) $\int_{-\infty}^{\infty} z^2 K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as the uniform, triangular, biweight, triweight, and Gaussian kernels [29]. Because the kernel estimation is insensitive to the choice of kernel function, we use the Gaussian kernel function due to its convenient mathematical properties, which is written as $K(z) = (1/\sqrt{2\pi})\,e^{-z^2/2}$.

The smoothing parameter h = h(n) is also called the bandwidth, and it depends on the sample size n. Specifically, $h(n) \to 0$ and $n \cdot h(n) \to \infty$ as $n \to \infty$.

Much of the literature reveals that the choice of kernel function does not affect the estimation significantly; however, the choice of the bandwidth is a vital issue [30, 31]. The determination of the bandwidth will be shown in detail in Section 5.3.

In the following, we introduce the kernel regression model proposed by Nadaraya [32]. Theoretically, we assume that each observation is denoted as (X, Y), which is an $\mathbb{R}^2$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target y given its predictive observation x is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\left((x - x_j)/h\right)}{\sum_{k=1}^{n} K\left((x - x_k)/h\right)} \cdot y_j \right], \qquad (4)$$

where K(·) is a kernel function and h is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the jth loan, respectively. Thereby, the estimate of the ith loan's return can be written as

$$\mu_i = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)} \cdot R_j \right]. \qquad (5)$$

Note that the determination of the loans' default probability will be introduced in Section 5.1.

Comparing (1) to (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)}. \qquad (6)$$

Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

$$\sigma_i^2 = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\left((p_i - p_k)/h\right)} \cdot \left(R_j - \mu_i\right)^2 \right] \tag{7}$$
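As a minimal sketch of how (5) to (7) could be evaluated, the following Python snippet computes the kernel weights, the expected return, and the variance of a new loan using a Gaussian kernel. The function names, the toy default probabilities and returns, and the bandwidth value are illustrative assumptions and are not taken from the paper.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_weights(p_new, p_hist, h):
    """Weights w_ij of Eq. (6): kernel values normalized to sum to one."""
    raw = [gaussian_kernel((p_new - p_j) / h) for p_j in p_hist]
    total = sum(raw)
    return [k / total for k in raw]


def estimate_return_and_risk(p_new, p_hist, r_hist, h):
    """Expected return (Eq. 5) and variance (Eq. 7) of a new loan."""
    w = kernel_weights(p_new, p_hist, h)
    mu = sum(w_j * r_j for w_j, r_j in zip(w, r_hist))
    var = sum(w_j * (r_j - mu) ** 2 for w_j, r_j in zip(w, r_hist))
    return mu, var


# Hypothetical historical loans: default probabilities and realized returns.
p_hist = [0.05, 0.10, 0.20, 0.40]
r_hist = [0.08, 0.07, 0.02, -0.30]

# New loan with an assumed default probability of 0.12 and bandwidth 0.05.
mu, var = estimate_return_and_risk(0.12, p_hist, r_hist, h=0.05)
print(mu, var)
```

Historical loans whose default probability is close to that of the new loan dominate both estimates, which is the intended effect of the similarity-based weighting.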

4. Robust Investment Decision Model

Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans' returns. We start from the classical mean-variance optimization model proposed by Markowitz [1] and proceed to its tractable robust counterpart.

4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to the investor's risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. Then the return and risk of the portfolio become $\lambda^T \mu$ and $\lambda^T V \lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$, respectively. Here $P$ represents the ideal estimated market condition, where $\mu$ and $V$, estimated using all available information (historical observations, news, expert knowledge, and so on), are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$
\begin{aligned}
(\text{MV}) \quad \min_{\lambda} \quad & \lambda^T V \lambda \\
\text{s.t.} \quad & \lambda^T \mu \ge R^*, \\
& \lambda \in \Omega,
\end{aligned} \tag{8}
$$

where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^*$ is the required return rate specified by the investor.

In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is rarely reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by directly using the estimated input parameters $\mu$ and $V$ may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that remain feasible no matter what the actual values of the parameters are.

The investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios, and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$
\begin{aligned}
(\text{RMV}) \quad \min_{\lambda} \ \sup_{Q \in \mathcal{Q}} \quad & \lambda^T V_Q \lambda \\
\text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^T \mu_Q \ge R^*, \\
& \lambda \in \Omega.
\end{aligned} \tag{9}
$$

It is rational to assume that the actual value of the parameters is in the neighborhood of the estimator. Thus, we can generate the uncertainty set $\mathcal{Q}$ based on the assumption that the measures in the set should not be far from the ideal measure $P$. Relative entropy, also known as the Kullback–Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of the measure $Q$ in $\mathcal{Q}$ with respect to the measure $P$ is

$$D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)} \, dx \tag{10}$$

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

$$D_{KL}(Q \,\|\, P) = \frac{1}{2} \left[ \ln |V| - \ln |V_Q| + \operatorname{tr}\left(V^{-1} V_Q\right) - n + (\mu - \mu_Q)^T V^{-1} (\mu - \mu_Q) \right] \tag{11}$$


where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^T$ denote the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
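Equation (11) has the same form as the closed-form relative entropy between two $n$-dimensional normal distributions. The sketch below evaluates it for the special case of diagonal covariance matrices; that restriction, the function name, and the toy numbers are assumptions made here to keep the example free of matrix libraries.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """D_KL(Q || P) of Eq. (11) when V and V_Q are diagonal.

    mu_p, var_p: means and variances under the reference measure P.
    mu_q, var_q: means and variances under the alternative measure Q.
    """
    n = len(mu_p)
    log_det_p = sum(math.log(v) for v in var_p)                 # ln|V|
    log_det_q = sum(math.log(v) for v in var_q)                 # ln|V_Q|
    trace_term = sum(vq / vp for vq, vp in zip(var_q, var_p))   # tr(V^{-1} V_Q)
    quad_term = sum((mp - mq) ** 2 / vp                         # (mu - mu_Q)^T V^{-1} (mu - mu_Q)
                    for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_p - log_det_q + trace_term - n + quad_term)


# Hypothetical two-loan example: Q shifts the means and inflates the variances.
mu_p, var_p = [0.07, 0.05], [0.004, 0.010]
mu_q, var_q = [0.06, 0.04], [0.005, 0.012]
print(kl_gaussian_diag(mu_q, var_q, mu_p, var_p))
```

The divergence is zero when the two parameter sets coincide and grows as $Q$ moves away from $P$, which is what allows a bound such as $D_{KL}(Q \,\|\, P) \le K$ to define a neighborhood of $P$.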

Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measures $Q$ in $\mathcal{Q}$. Using the constraint of relative entropy, we can rewrite the robust optimization model (9) as

$$
\begin{aligned}
(\text{RMV-RE}) \quad \min_{\lambda} \ \max_{(\mu_Q, V_Q) \in U} \quad & \lambda^T V_Q \lambda \\
\text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^T \mu_Q \ge R^*, \\
& D_{KL}(Q \,\|\, P) \le K, \\
& \lambda \in \Omega,
\end{aligned} \tag{12}
$$

where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investors' confidence in $\mu$ and $V$ estimated under probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.

Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is tractable and can be solved efficiently. That is,

$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^T V^* \lambda \\
\text{s.t.} \quad & \lambda^T \mu^* \ge R^*, \\
& \lambda \in \Omega.
\end{aligned} \tag{13}
$$

Herein, $\mu^* = \zeta \mu$, $V^* = V + \zeta (1 - \zeta) \mu \mu^T$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and as $\zeta$ decreases, the investor's confidence weakens. For the details of the proof, see Yam et al. [6].
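To make the adjustment concrete, the following sketch computes $\mu^*$ and $V^*$ from given $\mu$, $V$, and $\zeta$ using plain Python lists. The function name and the two-asset numbers are hypothetical; only the two formulas come from the text.

```python
def robust_parameters(mu, V, zeta):
    """Return (mu_star, V_star) with mu* = zeta * mu and
    V* = V + zeta * (1 - zeta) * mu mu^T, as used in problem (13)."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    mu_star = [zeta * m for m in mu]
    scale = zeta * (1.0 - zeta)
    V_star = [[V[i][j] + scale * mu[i] * mu[j] for j in range(n)]
              for i in range(n)]
    return mu_star, V_star


# Hypothetical two-asset inputs with zeta = 0.75.
mu = [0.10, 0.20]
V = [[0.01, 0.0], [0.0, 0.04]]
mu_star, V_star = robust_parameters(mu, V, zeta=0.75)
print(mu_star)
print(V_star)
```

With $\zeta < 1$ the expected returns shrink and the covariance matrix gains the rank-one term $\zeta(1-\zeta)\mu\mu^T$, so even uncorrelated assets acquire nonzero off-diagonal entries in $V^*$; with $\zeta = 1$ the inputs are returned unchanged.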

4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\sigma_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^T$ and

$$V = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_n^2 \end{bmatrix} \tag{14}$$

denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here, we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^T \left( V + \zeta (1 - \zeta) \mu \mu^T \right) \lambda \\
\text{s.t.} \quad & \lambda^T (\zeta \mu) \ge R^*, \\
& \lambda \in \Omega.
\end{aligned} \tag{15}
$$

Table 1: Description of variables.

| Variable | Description |
|---|---|
| X1 | FICO score of the borrower |
| X2 | The number of inquiries of the borrower in the last 6 months |
| X3 | The monetary amount of the loan |
| X4 | The homeownership status of the borrower (0 = rent, 1 = own) |
| X5 | The debt-to-income ratio of the borrower |
| X6 | The number of accounts delinquent |
| X7 | The number of public records in the past 10 years |
| Y | Dependent variable (0 = completed, 1 = default) |

The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.

(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.

(3) For each loan, the amount that the lender can invest is no more than the amount $m_i$ requested by the borrower; thereby $\lambda_i M \le m_i$, where $M$ is the total investment amount that the investor has available.
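The structure of problem (15) and of the three constraints above can be sketched as follows. The code does not solve the quadratic program; it only evaluates the robust objective and checks feasibility for a candidate weight vector, under the diagonal covariance assumption of (14). All names and numbers are hypothetical.

```python
def robust_objective(lam, mu, var, zeta):
    """Objective of (15) for a diagonal covariance matrix with variances `var`."""
    variance_part = sum(l * l * v for l, v in zip(lam, var))
    mean_return = sum(l * m for l, m in zip(lam, mu))
    # lam^T (mu mu^T) lam equals (lam^T mu)^2, so no matrix product is needed.
    return variance_part + zeta * (1.0 - zeta) * mean_return ** 2


def is_feasible(lam, mu, zeta, r_star, requests, budget, tol=1e-9):
    """Check the return constraint of (15) and constraints (1) to (3) on Omega."""
    mean_return = sum(l * m for l, m in zip(lam, mu))
    return (
        abs(sum(lam) - 1.0) <= tol                              # (1) weights sum to one
        and all(l >= -tol for l in lam)                         # (2) no short-selling
        and all(l * budget <= m_i + tol                         # (3) per-loan cap
                for l, m_i in zip(lam, requests))
        and zeta * mean_return >= r_star - tol                  # required return
    )


# Hypothetical three-loan example.
mu = [0.08, 0.07, 0.09]              # expected returns
var = [0.004, 0.002, 0.010]          # variances of returns
requests = [6000.0, 9000.0, 5000.0]  # amounts m_i requested by the borrowers
lam = [0.3, 0.5, 0.2]                # candidate portfolio weights

print(is_feasible(lam, mu, zeta=0.75, r_star=0.05, requests=requests, budget=15000.0))
print(robust_objective(lam, mu, var, zeta=0.75))
```

Because $\lambda^T \mu \mu^T \lambda = (\lambda^T \mu)^2$, the robust objective is the ordinary portfolio variance plus a penalty proportional to the squared expected portfolio return.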

5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using the real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocessing. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learned to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1 to 7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.

Figure 1: The curve of CV(h) (horizontal axis: bandwidth h, from 0.02 to 0.20; vertical axis: CV(h), from 0.095 to 0.1).

Many credit scoring models exist for predicting the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
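As an illustration of this preprocessing step, the sketch below fits a logistic regression by plain gradient ascent and maps loan attributes to a default probability. The paper does not describe its fitting procedure, so the optimizer, the learning rate, and the two standardized toy features are assumptions; a real analysis would rely on a statistics library and all seven attributes of Table 1.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def default_probability(beta, x_row):
    """Predicted default probability; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], x_row)))


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit the coefficients by gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    beta = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for x_row, y_i in zip(X, y):
            err = y_i - default_probability(beta, x_row)
            grad[0] += err
            for k in range(d):
                grad[k + 1] += err * x_row[k]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical standardized features: [FICO score, debt-to-income ratio].
X = [[1.2, -0.8], [0.9, -0.3], [0.4, 0.1], [-0.2, 0.5],
     [-0.7, 0.9], [-1.1, 1.4], [0.1, -0.5], [-0.4, 0.2]]
y = [0, 0, 1, 0, 1, 1, 0, 1]   # 0 = completed, 1 = default

beta = fit_logistic(X, y)
print(default_probability(beta, [-1.1, 1.4]), default_probability(beta, [1.2, -0.8]))
```

The predicted probabilities play the role of $p_i$ and $p_j$ in the kernel weights of Section 3.2.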

We randomly divide the dataset into two parts: one containing 40% of all loans, used for determining the optimal bandwidth $h$ in (5) as described in detail in Section 5.3, and a second part containing the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (thus their pay-back statuses are unknown), and all other subsets are taken as the training set, which consists of historical loans with known yield.
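The partition described above can be sketched as follows. The random seed, the function names, and the use of loan indices instead of loan records are assumptions for illustration; the paper does not state how the random split was implemented.

```python
import random


def split_loans(n_loans, bandwidth_share=0.4, n_folds=20, seed=0):
    """Split loan indices into a bandwidth-selection part and n_folds subsets."""
    rng = random.Random(seed)
    indices = list(range(n_loans))
    rng.shuffle(indices)
    n_bandwidth = round(bandwidth_share * n_loans)
    bandwidth_part = indices[:n_bandwidth]
    remainder = indices[n_bandwidth:]
    # Deal the remaining loans round-robin so fold sizes differ by at most one.
    folds = [remainder[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def train_test_rounds(folds):
    """Yield (training indices, testing indices) for each cross-validation round."""
    for k, test in enumerate(folds):
        train = [i for m, fold in enumerate(folds) if m != k for i in fold]
        yield train, test


bandwidth_part, folds = split_loans(17001)
print(len(bandwidth_part), [len(f) for f in folds])
```

With 17001 loans, 60% of the data split into 20 subsets gives about 510 loans per subset, in line with the figure quoted in the text.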

5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.

RIOM is the robust instance-based model proposed in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i^2$) of each loan in the testing set. Thus, the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.

(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment in the optimal portfolio.

(3) Compare the return rates of the two models.

5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(z) = (1/\sqrt{2\pi}) e^{-z^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:

$$h_{optimal} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left( \mu_h(p_{-i}) - \mu_i \right)^2 \tag{16}$$

where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1, j \ne i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{k=1, k \ne i}^{n} K\left((p_i - p_k)/h\right)} \cdot R_j \right] \tag{17}$$

The curve of CV(h) is shown in Figure 1. The shape of the curve clearly shows a minimum, and the $h$ corresponding to the minimum is the optimal bandwidth for the model.
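A minimal sketch of the leave-one-out search in (16) and (17) is given below. It reads the target $\mu_i$ in (16) as the observed return $R_i$ of loan $i$, which is an assumption, and it scans a grid over the bandwidth range displayed in Figure 1. The toy data and function names are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, r, h):
    """Leave-one-out estimate of Eq. (17): loan i is excluded from both sums."""
    num = 0.0
    den = 0.0
    for j, (p_j, r_j) in enumerate(zip(p, r)):
        if j == i:
            continue
        k = gaussian_kernel((p[i] - p_j) / h)
        num += k * r_j
        den += k
    if den == 0.0:
        raise ValueError("bandwidth too small: all kernel weights underflow")
    return num / den


def cv_score(p, r, h):
    """CV(h) of Eq. (16), comparing each estimate with the observed return."""
    return sum((loo_estimate(i, p, r, h) - r[i]) ** 2 for i in range(len(p)))


def select_bandwidth(p, r, grid):
    """Return the grid value of h with the smallest CV(h)."""
    return min(grid, key=lambda h: cv_score(p, r, h))


# Hypothetical default probabilities and realized returns of historical loans.
p = [0.02, 0.05, 0.08, 0.12, 0.18, 0.25, 0.33, 0.40]
r = [0.09, 0.08, 0.08, 0.06, 0.03, -0.05, -0.20, -0.35]

grid = [0.02 * k for k in range(1, 11)]   # 0.02, 0.04, ..., 0.20
h_opt = select_bandwidth(p, r, grid)
print(h_opt, cv_score(p, r, h_opt))
```

A grid search is used here only for transparency; any one-dimensional minimizer could replace it.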

To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^* = 0.05$. We also set the risk-free return rate to 0.025, which is approximately equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate for each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.

Figure 2: Performance comparison of IOM and RIOM (horizontal axis: the number of the parameter set, 1 to 10; vertical axis: return rate of investment, 0 to 0.09).

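Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

| Subset | IOM | RIOM |
|---|---|---|
| 1 | 0.0501 | 0.0566 |
| 2 | 0.0550 | 0.0633 |
| 3 | 0.0540 | 0.0618 |
| 4 | 0.0564 | 0.0696 |
| 5 | 0.0627 | 0.0714 |
| 6 | 0.0543 | 0.0629 |
| 7 | 0.0532 | 0.0688 |
| 8 | 0.0605 | 0.0711 |
| 9 | 0.0593 | 0.0706 |
| 10 | 0.0546 | 0.0664 |
| 11 | 0.0637 | 0.0701 |
| 12 | 0.0567 | 0.0640 |
| 13 | 0.0468 | 0.0569 |
| 14 | 0.0519 | 0.0663 |
| 15 | 0.0544 | 0.0620 |
| 16 | 0.0357 | 0.0472 |
| 17 | 0.0588 | 0.0710 |
| 18 | 0.0607 | 0.0774 |
| 19 | 0.0544 | 0.0655 |
| 20 | 0.0625 | 0.0808 |
| Average | 0.0553 | 0.0662 |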
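The averages in Table 2 can be recomputed from the per-subset return rates, as in the sketch below. The Sharpe ratio function is one conventional reading (mean excess return over the sample standard deviation across the 20 subsets, with the risk-free rate of 0.025 quoted in the text); the paper does not state how its Sharpe ratios were computed, so this definition is an assumption.

```python
import statistics

# Per-subset return rates copied from Table 2.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]
risk_free = 0.025


def sharpe_ratio(returns, risk_free):
    """Mean excess return divided by the sample standard deviation (assumed definition)."""
    return (statistics.mean(returns) - risk_free) / statistics.stdev(returns)


avg_iom = statistics.mean(iom)
avg_riom = statistics.mean(riom)
riom_wins = sum(1 for a, b in zip(iom, riom) if b > a)

print(round(avg_iom, 4), round(avg_riom, 4), riom_wins)
print(sharpe_ratio(iom, risk_free), sharpe_ratio(riom, risk_free))
```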

In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^*$ and investment amount $M$.

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. Figure 2 presents the results more intuitively by showing the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We find that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

| Set | Investment amount M | Required rate R* |
|---|---|---|
| 1 | $10000 | 5.0% |
| 2 | $10000 | 5.5% |
| 3 | $10000 | 6.0% |
| 4 | $15000 | 5.0% |
| 5 | $15000 | 5.5% |
| 6 | $15000 | 6.0% |
| 7 | $20000 | 5.0% |
| 8 | $20000 | 5.5% |
| 9 | $20000 | 6.0% |

In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive, quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, the model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, the model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, to ensure that the optimal allocation strategy is efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.

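Table 4: Investment performances of input parameters for portfolio selection.

M = 10000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0598 | 0.0727 | 0.0601 | 0.0762 | 0.0502 | 0.0722 |
| 2 | 0.0500 | 0.0592 | 0.0601 | 0.0779 | 0.0675 | 0.0851 |
| 3 | 0.0441 | 0.0571 | 0.0491 | 0.0698 | 0.0735 | 0.0934 |
| 4 | 0.0525 | 0.0602 | 0.0658 | 0.0903 | 0.0636 | 0.0754 |
| 5 | 0.0532 | 0.0620 | 0.0631 | 0.0829 | 0.0513 | 0.0731 |
| 6 | 0.0634 | 0.0747 | 0.0564 | 0.0762 | 0.0717 | 0.1105 |
| 7 | 0.0613 | 0.0736 | 0.0547 | 0.0754 | 0.0551 | 0.0884 |
| 8 | 0.0529 | 0.0594 | 0.0505 | 0.0616 | 0.0685 | 0.0858 |
| 9 | 0.0548 | 0.0647 | 0.0550 | 0.0736 | 0.0559 | 0.0493 |
| 10 | 0.0474 | 0.0574 | 0.0472 | 0.0634 | 0.0499 | 0.0794 |
| 11 | 0.0597 | 0.0730 | 0.0602 | 0.0795 | 0.0661 | 0.1090 |
| 12 | 0.0644 | 0.0768 | 0.0541 | 0.0673 | 0.0624 | 0.1042 |
| 13 | 0.0635 | 0.0785 | 0.0709 | 0.0895 | 0.0532 | 0.0662 |
| 14 | 0.0593 | 0.0744 | 0.0626 | 0.0751 | 0.0634 | 0.1204 |
| 15 | 0.0523 | 0.0636 | 0.0485 | 0.0609 | 0.0571 | 0.0987 |
| 16 | 0.0549 | 0.0705 | 0.0684 | 0.0893 | 0.0508 | 0.1264 |
| 17 | 0.0549 | 0.0666 | 0.0549 | 0.0757 | 0.0538 | 0.0677 |
| 18 | 0.0546 | 0.0629 | 0.0512 | 0.0615 | 0.0560 | 0.0610 |
| 19 | 0.0492 | 0.0555 | 0.0572 | 0.0685 | 0.0657 | 0.0436 |
| 20 | 0.0554 | 0.0645 | 0.0413 | 0.0504 | 0.0596 | 0.0366 |
| Average | 0.0554 | 0.0664 | 0.0566 | 0.0732 | 0.0598 | 0.0823 |

M = 15000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0501 | 0.0566 | 0.0558 | 0.0839 | 0.0520 | 0.0833 |
| 2 | 0.0550 | 0.0633 | 0.0517 | 0.0620 | 0.0551 | 0.0648 |
| 3 | 0.0540 | 0.0618 | 0.0598 | 0.0778 | 0.0631 | 0.0646 |
| 4 | 0.0564 | 0.0696 | 0.0512 | 0.0629 | 0.0553 | 0.0847 |
| 5 | 0.0627 | 0.0714 | 0.0566 | 0.0721 | 0.0616 | 0.0896 |
| 6 | 0.0543 | 0.0629 | 0.0570 | 0.0772 | 0.0585 | 0.0874 |
| 7 | 0.0532 | 0.0688 | 0.0528 | 0.0727 | 0.0620 | 0.0566 |
| 8 | 0.0605 | 0.0711 | 0.0545 | 0.0768 | 0.0645 | 0.0855 |
| 9 | 0.0593 | 0.0706 | 0.0507 | 0.0576 | 0.0574 | 0.1214 |
| 10 | 0.0546 | 0.0664 | 0.0528 | 0.0634 | 0.0622 | 0.0631 |
| 11 | 0.0637 | 0.0701 | 0.0562 | 0.0756 | 0.0498 | 0.0662 |
| 12 | 0.0567 | 0.0640 | 0.0529 | 0.0677 | 0.0574 | 0.0983 |
| 13 | 0.0468 | 0.0569 | 0.0637 | 0.0880 | 0.0504 | 0.0913 |
| 14 | 0.0519 | 0.0663 | 0.0568 | 0.0716 | 0.0614 | 0.1162 |
| 15 | 0.0544 | 0.0620 | 0.0577 | 0.0764 | 0.0633 | 0.0802 |
| 16 | 0.0357 | 0.0472 | 0.0642 | 0.0851 | 0.0573 | 0.0549 |
| 17 | 0.0588 | 0.0710 | 0.0674 | 0.0867 | 0.0615 | 0.0496 |
| 18 | 0.0607 | 0.0774 | 0.0585 | 0.0732 | 0.0687 | 0.0824 |
| 19 | 0.0544 | 0.0655 | 0.0434 | 0.0633 | 0.0589 | 0.0759 |
| 20 | 0.0625 | 0.0808 | 0.0562 | 0.0687 | 0.0698 | 0.0978 |
| Average | 0.0553 | 0.0662 | 0.0560 | 0.0731 | 0.0595 | 0.0807 |

M = 20000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0594 | 0.0774 | 0.0544 | 0.0689 | 0.0691 | 0.0649 |
| 2 | 0.0504 | 0.0584 | 0.0664 | 0.0895 | 0.0661 | 0.0769 |
| 3 | 0.0503 | 0.0568 | 0.0554 | 0.0737 | 0.0647 | 0.0869 |
| 4 | 0.0566 | 0.0648 | 0.0617 | 0.0856 | 0.0518 | 0.0635 |
| 5 | 0.0576 | 0.0709 | 0.0547 | 0.0692 | 0.0610 | 0.0771 |
| 6 | 0.0584 | 0.0683 | 0.0528 | 0.0701 | 0.0516 | 0.0385 |
| 7 | 0.0481 | 0.0549 | 0.0485 | 0.0631 | 0.0460 | 0.0759 |
| 8 | 0.0545 | 0.0701 | 0.0628 | 0.0823 | 0.0592 | 0.0845 |
| 9 | 0.0535 | 0.0586 | 0.0561 | 0.0764 | 0.0574 | 0.1038 |
| 10 | 0.0514 | 0.0597 | 0.0582 | 0.0683 | 0.0689 | 0.0532 |
| 11 | 0.0531 | 0.0584 | 0.0569 | 0.0657 | 0.0572 | 0.1141 |
| 12 | 0.0551 | 0.0678 | 0.0536 | 0.0734 | 0.0618 | 0.0687 |
| 13 | 0.0555 | 0.0695 | 0.0636 | 0.0824 | 0.0616 | 0.1157 |
| 14 | 0.0577 | 0.0674 | 0.0541 | 0.0636 | 0.0572 | 0.0818 |
| 15 | 0.0597 | 0.0775 | 0.0536 | 0.0706 | 0.0595 | 0.0704 |
| 16 | 0.0593 | 0.0698 | 0.0616 | 0.0807 | 0.0551 | 0.0748 |
| 17 | 0.0535 | 0.0641 | 0.0487 | 0.0636 | 0.0696 | 0.0915 |
| 18 | 0.0599 | 0.0729 | 0.0576 | 0.0746 | 0.0507 | 0.1069 |
| 19 | 0.0581 | 0.0673 | 0.0472 | 0.0638 | 0.0623 | 0.1148 |
| 20 | 0.0518 | 0.0709 | 0.0601 | 0.0734 | 0.0638 | 0.0744 |
| Average | 0.0552 | 0.0663 | 0.0564 | 0.0729 | 0.0597 | 0.0819 |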

Data Availability

The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.

[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.


accurate and estimated optimal strategies for the expected utility model in the portfolio optimization issue under the worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity to identify the optimal portfolio.

Since relative entropy has the ability to measure the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.

In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors only possess information about the historical data of asset returns. Bertsimas et al. [27] use the KS test, the $\chi^2$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the data points available, whereas the structure of the uncertainty set in our study is not predefined, and we consider the uncertainty of mean, covariance, and distribution jointly. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under the condition of distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with data of stock indices, while our work is based on the mean-variance framework with data of P2P loans.

3. Instance-Based Model for Credit Risk Assessment

Using historical data to evaluate future performance and potential loss is a convention. However, unlike bond or stock investment, the historical yield data about the same P2P borrower is usually unavailable. Thus, the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].

3.1. Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan $i$, using $n$ past loans, each with a historical return $R_j$ ($j = 1, 2, \ldots, n$), we can calculate the expected return of loan $i$, $\mu_i$, based on a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j \tag{1}$$

where $w_{ij}$ denotes the weight of loan $j$ for predicting the expected return of loan $i$. The weight depends on the similarity between loan $i$ and loan $j$. Intuitively, the greater the similarity, the greater the weight. The calculation of the weight is introduced in Section 3.2.

The weighted returns of the past loans are treated as historical observations of a new loan. Along this line of thought, taking variance as the risk measure, the weighted variance of the past loans is used to assess the new loan's risk, that is,

$$\sigma_i^2 = \sum_{j=1}^{n} w_{ij} \left(R_j - \mu_i\right)^2 \tag{2}$$

where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and therefore the larger the weight. In particular, the absolute deviation of default probabilities between loans $i$ and $j$ is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans $i$ and $j$, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process is introduced in the next subsection.

3.2. Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method for investigating the nonlinear relation between random variables, and it is based on kernel density estimation. We first introduce the preliminaries of kernel estimation.

Given $n$ realizations $z_j$, $j = 1, \ldots, n$, of a random variable $z$, the kernel estimate $\hat{p}(z)$ of the probability density function $p(z)$ is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{z_j - z}{h}\right), \tag{3}$$

where $K(\cdot)$ is a kernel function and $h$ is a smoothing parameter.

The kernel function $K(\cdot)$ is nonnegative and bounded and satisfies the following properties:

(a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$;
(b) $\int_{-\infty}^{\infty} z K(z)\,dz = 0$;
(c) $\int_{-\infty}^{\infty} z^2 K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as the uniform, triangular, biweight, triweight, and Gaussian kernels [29]. Because kernel estimation is insensitive to the choice of kernel function, we use the Gaussian kernel for its convenient mathematical properties, which is written as $K(z) = (1/\sqrt{2\pi})\,e^{-z^2/2}$.

The smoothing parameter $h = h(n)$, also called the bandwidth, depends on the sample size $n$. Specifically, $h(n) \to 0$ and $n \cdot h(n) \to \infty$ as $n \to \infty$.
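As a minimal sketch of the kernel density estimate (3) with the Gaussian kernel, the following Python fragment evaluates $\hat{p}(z)$ on a small sample. The sample values and the bandwidth are hypothetical and chosen only for illustration; they are not taken from the Prosper data.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Standard Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_density(z: float, samples: list, h: float) -> float:
    """Kernel density estimate (3) at point z with bandwidth h > 0."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(samples)
    return sum(gaussian_kernel((zj - z) / h) for zj in samples) / (n * h)


# Hypothetical realizations z_j (illustrative only).
samples = [0.05, 0.07, 0.08, 0.12, 0.20]
density_near_data = kernel_density(0.08, samples, h=0.05)
density_far_away = kernel_density(0.90, samples, h=0.05)
```

The estimate is large where observations cluster and close to zero far from them; a smaller $h$ makes the estimate follow individual observations more closely.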

Many studies show that the choice of kernel function does not affect the estimation significantly, whereas the choice of the bandwidth is vital [30, 31]. The determination of the bandwidth is described in detail in Section 5.3.

In the following, we introduce the kernel regression model proposed by Nadaraya [32]. We assume that each observation is denoted as $(X, Y)$, which is an $\mathbb{R}^2$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\left((x - x_j)/h\right)}{\sum_{l=1}^{n} K\left((x - x_l)/h\right)} \cdot y_j \right], \tag{4}$$

where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the $j$th loan, respectively. The estimate of the $i$th loan's return can therefore be written as

$$\mu_i = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)} \cdot R_j \right]. \tag{5}$$

Note that the determination of the loans' default probabilities is introduced in Section 5.1.

Comparing (1) with (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)}. \tag{6}$$

Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

$$\hat{\sigma}_i^2 = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)} \cdot (R_j - \mu_i)^2 \right]. \tag{7}$$
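The kernel-weighted estimates (5)–(7) can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' MATLAB code: the function names, the four historical loans, and the bandwidth are hypothetical.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_weights(p_new: float, p_hist: list, h: float) -> list:
    """Weights (6): normalized kernel values of the default-probability gaps."""
    raw = [gaussian_kernel((p_new - pj) / h) for pj in p_hist]
    total = sum(raw)  # assumed non-zero, i.e. no numerical underflow
    return [value / total for value in raw]


def estimate_return_and_variance(p_new, p_hist, r_hist, h):
    """Expected return (5) and variance (7) of a new loan."""
    w = kernel_weights(p_new, p_hist, h)
    mu = sum(wj * rj for wj, rj in zip(w, r_hist))
    var = sum(wj * (rj - mu) ** 2 for wj, rj in zip(w, r_hist))
    return mu, var


# Hypothetical closed loans: default probabilities and realized returns.
p_hist = [0.02, 0.05, 0.10, 0.30]
r_hist = [0.08, 0.07, 0.05, -0.40]
weights = kernel_weights(0.05, p_hist, h=0.05)
mu_new, var_new = estimate_return_and_variance(0.05, p_hist, r_hist, h=0.05)
```

Because the weights are nonnegative and sum to one, the estimated return always lies between the smallest and largest historical return, and the historical loan whose default probability is closest to that of the new loan receives the largest weight.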

4. Robust Investment Decision Model

Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be cast as a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty in the distribution of the loans' returns. We proceed from the classical mean-variance optimization model proposed by Markowitz [1] to its tractable robust counterpart.

4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to the investor's risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. The return and risk of the portfolio are then $\lambda^{T}\mu$ and $\lambda^{T} V \lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$. Here $P$ represents the ideal estimated market condition, in which $\mu$ and $V$, estimated using all available information (historical observations, news, expert knowledge, and so on), are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$\text{(MV)} \quad \begin{aligned} \min_{\lambda} \quad & \lambda^{T} V \lambda \\ \text{s.t.} \quad & \lambda^{T} \mu \ge R^{*}, \\ & \lambda \in \Omega, \end{aligned} \tag{8}$$

where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^{*}$ is the required return rate specified by the investor.
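To make the two quantities in (8) concrete, the sketch below evaluates the portfolio return $\lambda^{T}\mu$ and the portfolio risk $\lambda^{T} V \lambda$ for a two-asset example. The numbers are hypothetical and serve only to illustrate the formulas.

```python
def portfolio_return(weights, mu):
    """Expected portfolio return, lambda^T mu."""
    return sum(w * m for w, m in zip(weights, mu))


def portfolio_variance(weights, cov):
    """Portfolio risk, lambda^T V lambda, for a covariance matrix given as nested lists."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))


# Hypothetical two-asset example with uncorrelated returns.
mu = [0.06, 0.09]
cov = [[0.0004, 0.0], [0.0, 0.0025]]
weights = [0.5, 0.5]

expected = portfolio_return(weights, mu)   # 0.5 * 0.06 + 0.5 * 0.09
risk = portfolio_variance(weights, cov)    # 0.25 * 0.0004 + 0.25 * 0.0025
```

Shifting weight toward the second asset raises the expected return but also the variance, which is the tradeoff that problem (8) resolves for a given $R^{*}$.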

In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by directly using the estimated input parameters $\mu$ and $V$ may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that remain feasible no matter what the actual values of the parameters are.

Investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$\text{(RMV)} \quad \begin{aligned} \min_{\lambda} \ \sup_{Q \in \mathcal{Q}} \quad & \lambda^{T} V_Q \lambda \\ \text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^{T} \mu_Q \ge R^{*}, \\ & \lambda \in \Omega. \end{aligned} \tag{9}$$

It is rational to assume that the actual values of the parameters lie in a neighborhood of the estimates. Thus, we can generate the uncertainty set $\mathcal{Q}$ under the assumption that the measures in the set are not far from the ideal measure $P$. Relative entropy, also known as the Kullback–Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of a measure $Q \in \mathcal{Q}$ with respect to the measure $P$ is

$$D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)}\,dx, \tag{10}$$

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under the probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, where the returns are taken to be multivariate normal under both measures, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

$$D_{KL}(Q \,\|\, P) = \frac{1}{2}\left[\ln|V| - \ln|V_Q| + \operatorname{tr}\left(V^{-1} V_Q\right) - n + (\mu - \mu_Q)^{T} V^{-1} (\mu - \mu_Q)\right], \tag{11}$$

where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meanings as in (8) and (9); $\operatorname{tr}(\cdot)$, $|\cdot|$, and $(\cdot)^{T}$ denote the trace, the determinant, and the transpose, respectively; and $n$ is the number of assets in the portfolio.
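For intuition, (11) simplifies considerably when both covariance matrices are diagonal, since the determinant, trace, and quadratic form all reduce to sums over assets. The Python sketch below covers only this special case; the function name and the numerical values are hypothetical.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """D_KL(Q || P) from (11) when V and V_Q are diagonal (variances given as lists)."""
    n = len(mu_p)
    # ln|V| - ln|V_Q| for diagonal matrices.
    log_det_term = sum(math.log(vp / vq) for vp, vq in zip(var_p, var_q))
    # tr(V^{-1} V_Q) for diagonal matrices.
    trace_term = sum(vq / vp for vp, vq in zip(var_p, var_q))
    # (mu - mu_Q)^T V^{-1} (mu - mu_Q) for diagonal V.
    quad_term = sum((mp - mq) ** 2 / vp for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_term + trace_term - n + quad_term)


# Identical measures give zero divergence.
same = kl_gaussian_diag([0.06], [0.0004], [0.06], [0.0004])
# A mean shifted by half a standard deviation: 0.5 * (0.01^2 / 0.0004) = 0.125.
shifted = kl_gaussian_diag([0.05], [0.0004], [0.06], [0.0004])
```

The divergence grows as the alternative mean or variance moves away from the estimate, which is why bounding it restricts $Q$ to a neighborhood of $P$.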

Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measures $Q$ in $\mathcal{Q}$. Using the relative entropy constraint, we can rewrite the robust optimization model (9) as

$$\text{(RMV-RE)} \quad \begin{aligned} \min_{\lambda} \ \max_{(\mu_Q, V_Q) \in U} \quad & \lambda^{T} V_Q \lambda \\ \text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^{T} \mu_Q \ge R^{*}, \\ & D_{KL}(Q \,\|\, P) \le K, \\ & \lambda \in \Omega, \end{aligned} \tag{12}$$

where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investor's confidence in the $\mu$ and $V$ estimated under the probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.

Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is tractable and can be solved efficiently. That is,

$$\begin{aligned} \min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} V^{*} \lambda \\ \text{s.t.} \quad & \lambda^{T} \mu^{*} \ge R^{*}, \\ & \lambda \in \Omega. \end{aligned} \tag{13}$$

Herein, $\mu^{*} = \zeta\mu$, $V^{*} = V + \zeta(1 - \zeta)\mu\mu^{T}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in the $\mu$ and $V$ estimated under the measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and the investor's confidence weakens as $\zeta$ decreases. For the details of the proof, we refer the reader to Yam et al. [6].
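The adjusted inputs $\mu^{*}$ and $V^{*}$ of (13) are simple to compute from $\mu$, $V$, and $\zeta$. The following sketch uses plain nested lists; the function name and the two-asset numbers are hypothetical.

```python
def robust_parameters(mu, cov, zeta):
    """Return (mu*, V*) with mu* = zeta * mu and V* = V + zeta * (1 - zeta) * mu mu^T."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    scale = zeta * (1.0 - zeta)
    mu_star = [zeta * m for m in mu]
    cov_star = [[cov[i][j] + scale * mu[i] * mu[j] for j in range(n)] for i in range(n)]
    return mu_star, cov_star


# Hypothetical two-asset inputs.
mu = [0.06, 0.09]
cov = [[0.0004, 0.0], [0.0, 0.0025]]

mu_star, cov_star = robust_parameters(mu, cov, zeta=0.75)
mu_same, cov_same = robust_parameters(mu, cov, zeta=1.0)  # full confidence recovers (8)
```

With $\zeta < 1$ the expected returns are shrunk and the risk matrix is inflated, and the rank-one term $\zeta(1-\zeta)\mu\mu^{T}$ introduces off-diagonal entries even when $V$ itself is diagonal.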

Table 1: Description of variables.

| Variable | Description |
|---|---|
| X1 | FICO score of the borrower |
| X2 | The number of inquiries of the borrower in the last 6 months |
| X3 | The monetary amount of the loan |
| X4 | The homeownership status of the borrower (0 = rent, 1 = own) |
| X5 | The debt-to-income ratio of the borrower |
| X6 | The number of accounts delinquent |
| X7 | The number of public records in the past 10 years |
| Y | Dependent variable (0 = completed, 1 = default) |

4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\hat{\sigma}_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^{T}$ and

$$\hat{V} = \begin{bmatrix} \hat{\sigma}_1^2 & 0 & \cdots & 0 \\ 0 & \hat{\sigma}_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \hat{\sigma}_n^2 \end{bmatrix} \tag{14}$$

denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

$$\begin{aligned} \min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} \left(\hat{V} + \zeta(1 - \zeta)\mu\mu^{T}\right) \lambda \\ \text{s.t.} \quad & \lambda^{T}(\zeta\mu) \ge R^{*}, \\ & \lambda \in \Omega. \end{aligned} \tag{15}$$
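Because the covariance matrix in (14) is diagonal, the objective of (15) can be expanded term by term, which may help in reading the model. Writing $\hat{\sigma}_i^2$ for the $i$th diagonal entry of $\hat{V}$, we obtain

$$\lambda^{T}\left(\hat{V} + \zeta(1-\zeta)\mu\mu^{T}\right)\lambda = \sum_{i=1}^{n} \hat{\sigma}_i^2 \lambda_i^2 + \zeta(1-\zeta)\left(\lambda^{T}\mu\right)^2 .$$

The first term is the usual variance of a portfolio of uncorrelated loans, and the second term is a penalty that grows with the square of the estimated portfolio return. The penalty vanishes when $\zeta = 1$ and is largest at $\zeta = 0.5$.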

The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.
(2) Short-selling is forbidden; thus, $\lambda_i \ge 0$.
(3) For each loan, the amount that a lender can invest is no more than the amount $m_i$ requested by the borrower; thereby, $\lambda_i M \le m_i$, where $M$ is the total investment amount that the investor has available.
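The three constraints above can be checked mechanically for any candidate allocation. The sketch below is one possible membership test for $\Omega$; the function name, the tolerance, and the example amounts are hypothetical.

```python
def is_feasible(weights, requested, total_investment, tol=1e-9):
    """Check constraints (1)-(3) defining the feasible region for a weight vector."""
    budget_ok = abs(sum(weights) - 1.0) <= tol                 # (1) fully invested
    long_only = all(w >= -tol for w in weights)                # (2) no short-selling
    within_requests = all(                                     # (3) lambda_i * M <= m_i
        w * total_investment <= m + tol for w, m in zip(weights, requested)
    )
    return budget_ok and long_only and within_requests


# Hypothetical example: M = 15000 dollars spread over three loans.
requested = [4000.0, 10000.0, 8000.0]
ok = is_feasible([0.2, 0.5, 0.3], requested, 15000.0)        # 3000, 7500, 4500 dollars
too_much = is_feasible([0.4, 0.3, 0.3], requested, 15000.0)  # 6000 > 4000 on loan 1
```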

5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using a real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocessing. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learned to transform the loan attributes into a default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of delinquent accounts; and the borrower's number of public records in the past 10 years (Rows 1–7 in Table 1). The target variable is binary (0 represents completed and 1 represents default), as described in Row 8 of Table 1.

Figure 1: The curve of CV(h) (horizontal axis: bandwidth h; vertical axis: CV(h)).

There exist many credit scoring models for predicting the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
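For readers unfamiliar with the preprocessing step, a logistic regression maps the loan attributes X1–X7 of Table 1 to a default probability through the logistic function. The sketch below shows only this mapping. The coefficients and the intercept are hypothetical placeholders, not values fitted to the Prosper data.

```python
import math


def default_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum_k b_k * x_k)))."""
    score = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients for X1..X7 (placeholders, NOT fitted values).
coefficients = [-0.01, 0.2, 0.00002, -0.3, 1.5, 0.4, 0.3]
intercept = 4.0

# Two hypothetical borrowers that differ only in FICO score (X1).
p_high_fico = default_probability([700, 2, 5000, 1, 0.25, 0, 0], coefficients, intercept)
p_low_fico = default_probability([620, 2, 5000, 1, 0.25, 0, 0], coefficients, intercept)
```

With a negative coefficient on the FICO score, the borrower with the lower score receives the higher predicted default probability; the resulting $p_i$ values are the inputs to the kernel weights in Section 3.2.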

We randomly divide the dataset into two parts: the first, containing 40% of all loans, is used to determine the optimal bandwidth $h$ in (5), as described in detail in Section 5.3; the second contains the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (thus, their pay-back statuses are treated as unknown), and all other subsets are taken as the training set, which consists of historical loans with known yields.
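The data split described above can be sketched as follows. The seed, the function name, and the way the remainder is distributed across folds are assumptions made for illustration; the paper does not specify them.

```python
import random


def split_dataset(n_loans, bandwidth_share=0.4, n_folds=20, seed=0):
    """Shuffle loan indices, reserve a share for bandwidth selection, fold the rest."""
    indices = list(range(n_loans))
    random.Random(seed).shuffle(indices)
    cut = int(round(bandwidth_share * n_loans))
    bandwidth_part, rest = indices[:cut], indices[cut:]
    # Deal the remaining indices into n_folds nearly equal subsets.
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


bandwidth_part, folds = split_dataset(17001)
fold_sizes = [len(fold) for fold in folds]

# In round k, folds[k] is the testing set and the other 19 folds form the training set.
test_set = folds[0]
train_set = [i for k, fold in enumerate(folds) if k != 0 for i in fold]
```

With 17001 loans, this yields 6800 loans for bandwidth selection and 20 folds of 510 or 511 loans each, in line with the approximate subset size quoted above.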

5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. To show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. As the results of Guo et al. [5] show, this model outperforms some rating-based models.

RIOM is the robust instance-based model proposed in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust credit portfolio optimization model based on the relative entropy method, Equation (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\hat{\sigma}_i^2$) of each loan in the testing set. Thus, the expected return vector $\mu$ and the covariance matrix $\hat{V}$ can be obtained.
(2) For each model, feed the predicted expected return vector $\mu$ and covariance matrix $\hat{V}$ of the testing loans into the portfolio optimization algorithm and compute the performance of investing in the optimal portfolio.
(3) Compare the return rates of the two models.

5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(z) = (1/\sqrt{2\pi})\,e^{-z^2/2}$ as the kernel function. The key parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:

$$h_{\mathrm{optimal}} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_h(p_{-i}) - \mu_i\right)^2, \tag{16}$$

where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1,\, j \ne i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1,\, l \ne i}^{n} K\left((p_i - p_l)/h\right)} \cdot R_j \right]. \tag{17}$$

The curve of $CV(h)$ is shown in Figure 1. The shape of the curve clearly shows a minimum, and the $h$ corresponding to the minimum is the optimal bandwidth for the model.
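The bandwidth search in (16) and (17) can be sketched as a grid search over candidate values of $h$. Two assumptions are made here and should be read as such: the observed return $R_i$ of the left-out loan stands in for the target $\mu_i$ in (16), which the text does not spell out, and the candidate grid 0.02, 0.04, ..., 0.20 mirrors the horizontal axis of Figure 1. The eight loans are hypothetical.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, r, h):
    """Leave-one-out kernel estimate (17) for loan i, excluding its own return."""
    num = 0.0
    den = 0.0
    for j in range(len(p)):
        if j == i:
            continue
        k = gaussian_kernel((p[i] - p[j]) / h)
        num += k * r[j]
        den += k
    return num / den


def cv_score(h, p, r):
    """CV(h) from (16), using the observed return R_i as the target (assumption)."""
    return sum((loo_estimate(i, p, r, h) - r[i]) ** 2 for i in range(len(p)))


def select_bandwidth(candidates, p, r):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(candidates, key=lambda h: cv_score(h, p, r))


# Hypothetical default probabilities and realized returns of closed loans.
p = [0.02, 0.04, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30]
r = [0.08, 0.07, 0.075, 0.06, 0.05, 0.02, -0.05, -0.30]
candidates = [0.02 * k for k in range(1, 11)]
best_h = select_bandwidth(candidates, p, r)
```

Plotting `cv_score` against the candidates would reproduce a curve of the kind shown in Figure 1, although its shape depends entirely on the data used.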

To apply the robust credit portfolio optimization method and obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^{*} = 0.05$. We also set the risk-free return rate to 0.025, which is roughly equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.
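To illustrate what the solver is asked to do, the sketch below solves a three-loan instance of problem (15) by exhaustive enumeration over a weight grid. This is a deliberately naive stand-in and not the algorithm used by the MATLAB solver; it is practical only for a handful of loans. The loan estimates, the requested amounts, and the grid resolution are hypothetical, while $\zeta = 0.75$, $M$ = 15000, and $R^{*} = 0.05$ follow the text.

```python
def solve_rmv_grid(mu, var, zeta, r_star, requested, total, steps=100):
    """Brute-force search for problem (15) with exactly three loans.

    Returns (weights, risk), or (None, inf) when no grid point is feasible.
    """
    scale = zeta * (1.0 - zeta)
    best_weights, best_risk = None, float("inf")
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            # Weights are nonnegative and sum to one by construction.
            lam = [a / steps, b / steps, (steps - a - b) / steps]
            # Constraint (3): invested amount cannot exceed the borrower's request.
            if any(w * total > m + 1e-9 for w, m in zip(lam, requested)):
                continue
            expected = sum(w * m for w, m in zip(lam, mu))
            # Robust return constraint: lambda^T (zeta * mu) >= R*.
            if zeta * expected < r_star - 1e-12:
                continue
            # Objective of (15) with a diagonal covariance matrix.
            risk = sum(v * w * w for v, w in zip(var, lam)) + scale * expected ** 2
            if risk < best_risk:
                best_weights, best_risk = lam, risk
    return best_weights, best_risk


# Hypothetical loan estimates (illustrative only).
mu = [0.06, 0.08, 0.10]
var = [0.0004, 0.0009, 0.0025]
requested = [10000.0, 10000.0, 10000.0]

weights, risk = solve_rmv_grid(mu, var, 0.75, 0.05, requested, 15000.0)
infeasible, _ = solve_rmv_grid(mu, var, 0.75, 0.20, requested, 15000.0)
```

When the required return is set beyond what the shrunken returns $\zeta\mu$ can deliver, as in the second call, no allocation is feasible and the function reports this by returning `None`.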

Table 2 summarizes the investment return rate on each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except on subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.

Figure 2: Performance comparison (horizontal axis: the number of the parameter set; vertical axis: return rate of investment; series: IOM and RIOM).

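Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

| Subset | IOM | RIOM |
|---|---|---|
| 1 | 0.0501 | 0.0566 |
| 2 | 0.0550 | 0.0633 |
| 3 | 0.0540 | 0.0618 |
| 4 | 0.0564 | 0.0696 |
| 5 | 0.0627 | 0.0714 |
| 6 | 0.0543 | 0.0629 |
| 7 | 0.0532 | 0.0688 |
| 8 | 0.0605 | 0.0711 |
| 9 | 0.0593 | 0.0706 |
| 10 | 0.0546 | 0.0664 |
| 11 | 0.0637 | 0.0701 |
| 12 | 0.0567 | 0.0640 |
| 13 | 0.0468 | 0.0569 |
| 14 | 0.0519 | 0.0663 |
| 15 | 0.0544 | 0.0620 |
| 16 | 0.0357 | 0.0472 |
| 17 | 0.0588 | 0.0710 |
| 18 | 0.0607 | 0.0774 |
| 19 | 0.0544 | 0.0655 |
| 20 | 0.0625 | 0.0808 |
| Average | 0.0553 | 0.0662 |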
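The summary statistics of Table 2 can be recomputed directly from its 20 rows. The sketch below reproduces the two averages and computes a Sharpe ratio with the risk-free rate of 0.025 quoted in the text. Treating the 20 subset returns as the sample over which the mean and standard deviation are taken is an assumption of this sketch; the paper does not state how its Sharpe ratio was computed.

```python
from statistics import mean, stdev

# Return rates per test subset, copied from Table 2.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]
RISK_FREE = 0.025


def sharpe_ratio(returns, risk_free):
    """Excess mean return divided by the sample standard deviation (assumed definition)."""
    return (mean(returns) - risk_free) / stdev(returns)


avg_iom = round(mean(iom), 4)
avg_riom = round(mean(riom), 4)
riom_wins = sum(1 for a, b in zip(iom, riom) if b > a)
sharpe_iom = sharpe_ratio(iom, RISK_FREE)
sharpe_riom = sharpe_ratio(riom, RISK_FREE)
```

The recomputed averages match the last row of Table 2, and RIOM yields the higher return on all 20 subsets.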

To verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^{*}$ and investment amount $M$.

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The results are shown more intuitively in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis of Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can see that the RIOM model outperforms the IOM model on average for every parameter set.

Table 3: Investors' choices of input parameters for portfolio selection.

| Set | Investment amount M | Required rate R* |
|---|---|---|
| 1 | $10000 | 5.0% |
| 2 | $10000 | 5.5% |
| 3 | $10000 | 6.0% |
| 4 | $15000 | 5.0% |
| 5 | $15000 | 5.5% |
| 6 | $15000 | 6.0% |
| 7 | $20000 | 5.0% |
| 8 | $20000 | 5.5% |
| 9 | $20000 | 6.0% |
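The nine parameter sets are the Cartesian product of three investment amounts and three required rates, numbered with the investment amount varying slowest. A short sketch of this enumeration follows; the variable names are hypothetical.

```python
from itertools import product

amounts = [10_000, 15_000, 20_000]   # investment amount M in dollars
rates = [0.050, 0.055, 0.060]        # required return rate R*

# Set k -> (M, R*), numbered 1..9 in the same order as Table 3.
parameter_sets = {
    k: (m, r) for k, (m, r) in enumerate(product(amounts, rates), start=1)
}
```

Set 4, for example, is the baseline configuration ($M$ = 15000, $R^{*}$ = 0.05) used for Table 2.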

In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust portfolio optimization model with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, the model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, the model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments indicate that the P2P lending investment decision model using robust optimization with relative entropy constraints performs better than the existing model.

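Table 4: Investment performances of input parameters for portfolio selection.

(a) M = 10000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0598 | 0.0727 | 0.0601 | 0.0762 | 0.0502 | 0.0722 |
| 2 | 0.0500 | 0.0592 | 0.0601 | 0.0779 | 0.0675 | 0.0851 |
| 3 | 0.0441 | 0.0571 | 0.0491 | 0.0698 | 0.0735 | 0.0934 |
| 4 | 0.0525 | 0.0602 | 0.0658 | 0.0903 | 0.0636 | 0.0754 |
| 5 | 0.0532 | 0.0620 | 0.0631 | 0.0829 | 0.0513 | 0.0731 |
| 6 | 0.0634 | 0.0747 | 0.0564 | 0.0762 | 0.0717 | 0.1105 |
| 7 | 0.0613 | 0.0736 | 0.0547 | 0.0754 | 0.0551 | 0.0884 |
| 8 | 0.0529 | 0.0594 | 0.0505 | 0.0616 | 0.0685 | 0.0858 |
| 9 | 0.0548 | 0.0647 | 0.0550 | 0.0736 | 0.0559 | 0.0493 |
| 10 | 0.0474 | 0.0574 | 0.0472 | 0.0634 | 0.0499 | 0.0794 |
| 11 | 0.0597 | 0.0730 | 0.0602 | 0.0795 | 0.0661 | 0.1090 |
| 12 | 0.0644 | 0.0768 | 0.0541 | 0.0673 | 0.0624 | 0.1042 |
| 13 | 0.0635 | 0.0785 | 0.0709 | 0.0895 | 0.0532 | 0.0662 |
| 14 | 0.0593 | 0.0744 | 0.0626 | 0.0751 | 0.0634 | 0.1204 |
| 15 | 0.0523 | 0.0636 | 0.0485 | 0.0609 | 0.0571 | 0.0987 |
| 16 | 0.0549 | 0.0705 | 0.0684 | 0.0893 | 0.0508 | 0.1264 |
| 17 | 0.0549 | 0.0666 | 0.0549 | 0.0757 | 0.0538 | 0.0677 |
| 18 | 0.0546 | 0.0629 | 0.0512 | 0.0615 | 0.0560 | 0.0610 |
| 19 | 0.0492 | 0.0555 | 0.0572 | 0.0685 | 0.0657 | 0.0436 |
| 20 | 0.0554 | 0.0645 | 0.0413 | 0.0504 | 0.0596 | 0.0366 |
| Average | 0.0554 | 0.0664 | 0.0566 | 0.0732 | 0.0598 | 0.0823 |

(b) M = 15000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0501 | 0.0566 | 0.0558 | 0.0839 | 0.0520 | 0.0833 |
| 2 | 0.0550 | 0.0633 | 0.0517 | 0.0620 | 0.0551 | 0.0648 |
| 3 | 0.0540 | 0.0618 | 0.0598 | 0.0778 | 0.0631 | 0.0646 |
| 4 | 0.0564 | 0.0696 | 0.0512 | 0.0629 | 0.0553 | 0.0847 |
| 5 | 0.0627 | 0.0714 | 0.0566 | 0.0721 | 0.0616 | 0.0896 |
| 6 | 0.0543 | 0.0629 | 0.0570 | 0.0772 | 0.0585 | 0.0874 |
| 7 | 0.0532 | 0.0688 | 0.0528 | 0.0727 | 0.0620 | 0.0566 |
| 8 | 0.0605 | 0.0711 | 0.0545 | 0.0768 | 0.0645 | 0.0855 |
| 9 | 0.0593 | 0.0706 | 0.0507 | 0.0576 | 0.0574 | 0.1214 |
| 10 | 0.0546 | 0.0664 | 0.0528 | 0.0634 | 0.0622 | 0.0631 |
| 11 | 0.0637 | 0.0701 | 0.0562 | 0.0756 | 0.0498 | 0.0662 |
| 12 | 0.0567 | 0.0640 | 0.0529 | 0.0677 | 0.0574 | 0.0983 |
| 13 | 0.0468 | 0.0569 | 0.0637 | 0.0880 | 0.0504 | 0.0913 |
| 14 | 0.0519 | 0.0663 | 0.0568 | 0.0716 | 0.0614 | 0.1162 |
| 15 | 0.0544 | 0.0620 | 0.0577 | 0.0764 | 0.0633 | 0.0802 |
| 16 | 0.0357 | 0.0472 | 0.0642 | 0.0851 | 0.0573 | 0.0549 |
| 17 | 0.0588 | 0.0710 | 0.0674 | 0.0867 | 0.0615 | 0.0496 |
| 18 | 0.0607 | 0.0774 | 0.0585 | 0.0732 | 0.0687 | 0.0824 |
| 19 | 0.0544 | 0.0655 | 0.0434 | 0.0633 | 0.0589 | 0.0759 |
| 20 | 0.0625 | 0.0808 | 0.0562 | 0.0687 | 0.0698 | 0.0978 |
| Average | 0.0553 | 0.0662 | 0.0560 | 0.0731 | 0.0595 | 0.0807 |

(c) M = 20000

| Subset | R*=5% IOM | R*=5% RIOM | R*=5.5% IOM | R*=5.5% RIOM | R*=6% IOM | R*=6% RIOM |
|---|---|---|---|---|---|---|
| 1 | 0.0594 | 0.0774 | 0.0544 | 0.0689 | 0.0691 | 0.0649 |
| 2 | 0.0504 | 0.0584 | 0.0664 | 0.0895 | 0.0661 | 0.0769 |
| 3 | 0.0503 | 0.0568 | 0.0554 | 0.0737 | 0.0647 | 0.0869 |
| 4 | 0.0566 | 0.0648 | 0.0617 | 0.0856 | 0.0518 | 0.0635 |
| 5 | 0.0576 | 0.0709 | 0.0547 | 0.0692 | 0.0610 | 0.0771 |
| 6 | 0.0584 | 0.0683 | 0.0528 | 0.0701 | 0.0516 | 0.0385 |
| 7 | 0.0481 | 0.0549 | 0.0485 | 0.0631 | 0.0460 | 0.0759 |
| 8 | 0.0545 | 0.0701 | 0.0628 | 0.0823 | 0.0592 | 0.0845 |
| 9 | 0.0535 | 0.0586 | 0.0561 | 0.0764 | 0.0574 | 0.1038 |
| 10 | 0.0514 | 0.0597 | 0.0582 | 0.0683 | 0.0689 | 0.0532 |
| 11 | 0.0531 | 0.0584 | 0.0569 | 0.0657 | 0.0572 | 0.1141 |
| 12 | 0.0551 | 0.0678 | 0.0536 | 0.0734 | 0.0618 | 0.0687 |
| 13 | 0.0555 | 0.0695 | 0.0636 | 0.0824 | 0.0616 | 0.1157 |
| 14 | 0.0577 | 0.0674 | 0.0541 | 0.0636 | 0.0572 | 0.0818 |
| 15 | 0.0597 | 0.0775 | 0.0536 | 0.0706 | 0.0595 | 0.0704 |
| 16 | 0.0593 | 0.0698 | 0.0616 | 0.0807 | 0.0551 | 0.0748 |
| 17 | 0.0535 | 0.0641 | 0.0487 | 0.0636 | 0.0696 | 0.0915 |
| 18 | 0.0599 | 0.0729 | 0.0576 | 0.0746 | 0.0507 | 0.1069 |
| 19 | 0.0581 | 0.0673 | 0.0472 | 0.0638 | 0.0623 | 0.1148 |
| 20 | 0.0518 | 0.0709 | 0.0601 | 0.0734 | 0.0638 | 0.0744 |
| Average | 0.0552 | 0.0663 | 0.0564 | 0.0729 | 0.0597 | 0.0819 |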


Data Availability

The data used in this paper were downloaded from the website of Prosper, https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.

[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

4 Mathematical Problems in Engineering

vector, $\mathbb{R}^2$-valued. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

\[
\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\left((x - x_j)/h\right)}{\sum_{l=1}^{n} K\left((x - x_l)/h\right)} \cdot y_j \right], \tag{4}
\]

where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the $j$th loan, respectively. Thereby the estimation of the $i$th loan's return could be written as

\[
\mu_i = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)} \cdot R_j \right]. \tag{5}
\]

Note that the determination of loans' default probability will be introduced in Section 5.1.

Comparing (1) to (5), we can represent the optimal weight $w_{ij}$ as

\[
w_{ij} = \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)}. \tag{6}
\]

Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

\[
\sigma_i^2 = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1}^{n} K\left((p_i - p_l)/h\right)} \cdot \left(R_j - \mu_i\right)^2 \right]. \tag{7}
\]
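As a minimal sketch of Eqs. (5)-(7), the following Python code computes the kernel weights and the weighted mean and variance of the return for one target loan. It assumes the Gaussian kernel that Section 5.3 names as the paper's choice; the function names, the toy data, and the bandwidth value h = 0.08 are hypothetical and serve only as illustration.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_weights(p_i, p_hist, h):
    """Normalized kernel weights w_ij of Eq. (6) for one target loan."""
    raw = [gaussian_kernel((p_i - p_j) / h) for p_j in p_hist]
    total = sum(raw)
    if total == 0.0:
        # All kernel values underflowed: h is too small for these data.
        raise ValueError("bandwidth h is too small for these data")
    return [k / total for k in raw]


def estimate_return_and_variance(p_i, p_hist, r_hist, h):
    """Kernel-weighted mean, Eq. (5), and variance, Eq. (7), of the return."""
    w = kernel_weights(p_i, p_hist, h)
    mu_i = sum(w_j * r_j for w_j, r_j in zip(w, r_hist))
    var_i = sum(w_j * (r_j - mu_i) ** 2 for w_j, r_j in zip(w, r_hist))
    return mu_i, var_i


# Hypothetical toy history: default probabilities and realized return rates.
p_hist = [0.05, 0.10, 0.20, 0.40]
r_hist = [0.06, 0.08, 0.03, -0.25]
mu_i, var_i = estimate_return_and_variance(0.12, p_hist, r_hist, h=0.08)
```

Because the weights sum to one, the estimate always lies between the smallest and the largest historical return, and a smaller h concentrates the weight on the loans whose default probability is closest to that of the target loan.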

4. Robust Investment Decision Model

Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans. We start from the classical mean-variance optimization model proposed by Markowitz [1] and proceed to its tractable robust counterpart.

4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to investors' risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. Then the return and risk of the portfolio become $\lambda^{T}\mu$ and $\lambda^{T}V\lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$, respectively. Here $P$ represents the ideal estimated market condition, where $\mu$ and $V$, estimated by using all available information including historical observations, news, expert knowledge, and so on, are assumed to be the actual expected return and covariance matrix. Thus the classical mean-variance portfolio selection problem (MV) can be formulated as

\[
(\mathrm{MV}) \quad
\begin{aligned}
\min_{\lambda} \quad & \lambda^{T} V \lambda \\
\text{s.t.} \quad & \lambda^{T} \mu \ge R^{*}, \\
& \lambda \in \Omega,
\end{aligned}
\tag{8}
\]

where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^{*}$ is the required return rate specified by the investor.

In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by directly using the estimated input parameters $\mu$ and $V$ may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that must remain feasible no matter what the actual values of the parameters are.

Investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios, and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

\[
(\mathrm{RMV}) \quad
\begin{aligned}
\min_{\lambda} \ \sup_{Q \in \mathcal{Q}} \quad & \lambda^{T} V_Q \lambda \\
\text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^{T} \mu_Q \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{9}
\]

It is rational to assume that the actual values of the parameters are in the neighborhood of the estimator. Thus, we can generate the uncertainty set $\mathcal{Q}$ based on the assumption that the measures in the set should not be far from the ideal measure $P$. Relative entropy, also known as the Kullback–Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of the measure $Q$ in $\mathcal{Q}$ with respect to the measure $P$ is

\[
D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)} \, dx, \tag{10}
\]

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

\[
D_{KL}(Q \,\|\, P) = \frac{1}{2} \left[ \ln |V| - \ln |V_Q| + \operatorname{tr}\left(V^{-1} V_Q\right) - n + \left(\mu - \mu_Q\right)^{T} V^{-1} \left(\mu - \mu_Q\right) \right], \tag{11}
\]

where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^{T}$ are the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
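Formula (11) coincides with the standard closed form of the Kullback–Leibler divergence between two multivariate normal distributions. The sketch below evaluates it in Python for the special case in which both covariance matrices are diagonal, so that determinants, traces, and inverses reduce to elementwise operations. The diagonal restriction, the function name `kl_gaussian_diag`, and the numbers are assumptions made for illustration and are not taken from the paper.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """D_KL(Q || P) of Eq. (11) when V and V_Q are diagonal.

    var_p and var_q hold the diagonal entries of V and V_Q.
    """
    n = len(mu_p)
    log_det_p = sum(math.log(v) for v in var_p)            # ln |V|
    log_det_q = sum(math.log(v) for v in var_q)            # ln |V_Q|
    trace_term = sum(vq / vp for vq, vp in zip(var_q, var_p))  # tr(V^-1 V_Q)
    # Quadratic form (mu - mu_Q)^T V^-1 (mu - mu_Q)
    quad = sum((mp - mq) ** 2 / vp for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_p - log_det_q + trace_term - n + quad)


# Hypothetical reference measure P for two loans.
mu_p, var_p = [0.06, 0.08], [0.010, 0.020]
kl_same = kl_gaussian_diag(mu_p, var_p, mu_p, var_p)
kl_shifted = kl_gaussian_diag([0.045, 0.06], var_p, mu_p, var_p)
```

The divergence is zero when Q equals P and grows as the mean or variance of Q moves away from that of P, which is what makes it usable as a radius for the uncertainty set.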

Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measure $Q$ in $\mathcal{Q}$. Using the constraint of relative entropy, we can rewrite the robust optimization model (9) as

\[
(\mathrm{RMV\text{-}RE}) \quad
\begin{aligned}
\min_{\lambda} \ \max_{(\mu_Q, V_Q) \in U} \quad & \lambda^{T} V_Q \lambda \\
\text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^{T} \mu_Q \ge R^{*}, \\
& D_{KL}(Q \,\|\, P) \le K, \\
& \lambda \in \Omega,
\end{aligned}
\tag{12}
\]

where $K$ is a positive constant that determines the size of the uncertainty set. Parameter $K$ measures the level of uncertainty and reflects the investors' confidence in $\mu$ and $V$ estimated under probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.

Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is a tractable formulation and can be efficiently solved. That is,

\[
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} V^{*} \lambda \\
\text{s.t.} \quad & \lambda^{T} \mu^{*} \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{13}
\]

Herein, $\mu^{*} = \zeta\mu$, $V^{*} = V + \zeta(1 - \zeta)\mu\mu^{T}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and as $\zeta$ decreases, the investor's confidence weakens. For the details of the proof, see Yam et al. [6].
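The adjusted inputs of problem (13) are simple to compute once the estimates and the confidence level are given. The following Python sketch builds the adjusted return vector and covariance matrix from the two formulas above. The function name and the two-loan numbers are hypothetical; the value 0.75 for the confidence parameter is the one reported later in Section 5.3.

```python
def robust_parameters(mu, V, zeta):
    """Return (mu_star, V_star) for problem (13).

    mu_star = zeta * mu
    V_star  = V + zeta * (1 - zeta) * mu * mu^T
    """
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    scale = zeta * (1.0 - zeta)
    mu_star = [zeta * m for m in mu]
    V_star = [[V[i][j] + scale * mu[i] * mu[j] for j in range(n)]
              for i in range(n)]
    return mu_star, V_star


# Hypothetical two-loan example with a diagonal covariance matrix.
mu = [0.06, 0.08]
V = [[0.010, 0.0], [0.0, 0.020]]
mu_star, V_star = robust_parameters(mu, V, zeta=0.75)
```

With a confidence level of 1 the function returns the inputs unchanged, so problem (13) reduces to the classical problem (8). For smaller values the expected returns shrink and a rank-one term is added to the covariance matrix, which makes the return constraint harder to meet and the measured risk larger.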

Table 1: Description of variables.

Variable   Description
X1         FICO score of the borrower
X2         The number of inquiries of the borrower in the last 6 months
X3         The monetary amount of the loan
X4         The homeownership status of the borrower (0 = rent, 1 = own)
X5         The debt-to-income ratio of the borrower
X6         The number of accounts delinquent
X7         The number of public records in the past 10 years
Y          Dependent variable (0 = completed, 1 = default)

4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\sigma_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^{T}$ and

\[
V =
\begin{bmatrix}
\sigma_1^2 & 0 & \cdots & 0 \\
0 & \sigma_2^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_n^2
\end{bmatrix}
\tag{14}
\]

denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

\[
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{T} \left( V + \zeta (1 - \zeta) \mu \mu^{T} \right) \lambda \\
\text{s.t.} \quad & \lambda^{T} (\zeta \mu) \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned}
\tag{15}
\]

The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.

(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.

(3) For each loan, the amount that the lender can invest is no more than the amount $m_i$ requested by the borrower; thereby $\lambda_i M \le m_i$, where $M$ is the total investment amount that the investor has available.
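The three constraints can be checked mechanically for any candidate weight vector. The Python sketch below does so; the function name `is_feasible`, the numerical tolerance, and the example amounts are hypothetical.

```python
def is_feasible(weights, requested, total_investment, tol=1e-9):
    """Check whether a weight vector lies in the feasible region.

    weights          -- portfolio weights, one per loan
    requested        -- amount requested by each borrower (m_i)
    total_investment -- total amount available to the investor (M)
    """
    # (1) The weights sum to one.
    budget_ok = abs(sum(weights) - 1.0) <= tol
    # (2) No short-selling.
    no_short = all(w >= -tol for w in weights)
    # (3) The amount placed in each loan does not exceed the request.
    caps_ok = all(w * total_investment <= m + tol
                  for w, m in zip(weights, requested))
    return budget_ok and no_short and caps_ok


# Hypothetical example: three loans and 15,000 dollars to invest.
requested = [8000.0, 5000.0, 4000.0]
ok = is_feasible([0.5, 0.3, 0.2], requested, 15000.0)
too_concentrated = is_feasible([0.7, 0.3, 0.0], requested, 15000.0)
```

In the second call the first loan would receive 10,500 dollars against a request of 8,000, so constraint (3) fails even though the weights are nonnegative and sum to one.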

5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using the real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocess. The dataset for the empirical study is from a notable P2P lending platform in the United States, Prosper. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1–7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.

Figure 1: The curve of CV(h). [Plot not reproduced; horizontal axis: bandwidth h, from 0 to 0.2; vertical axis: CV(h), from 0.095 to 0.1.]

There exist many credit scoring models to predict the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.

We randomly divide the dataset into two parts: one containing 40% of all loans, used for determining the optimal bandwidth $h$ in (5) as described in detail in Section 5.3, and a second part containing 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in, so their pay-back statuses are treated as unknown, and all other subsets are taken as a training set, which consists of historical loans with known yield.
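A minimal Python sketch of this splitting scheme follows. It assumes that loans are identified by integer ids and that the split is a plain random shuffle, since the paper does not state how the randomization was implemented. The function names and the seed are hypothetical.

```python
import random


def split_dataset(loan_ids, bandwidth_share=0.4, n_folds=20, seed=0):
    """Split loans into a bandwidth-selection part and n_folds test folds."""
    ids = list(loan_ids)
    random.Random(seed).shuffle(ids)
    cut = round(bandwidth_share * len(ids))
    bandwidth_part, rest = ids[:cut], ids[cut:]
    # Deal the remaining loans round-robin so fold sizes differ by at most one.
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def train_test_rounds(folds):
    """Yield (training set, testing set) pairs, one per fold."""
    for k, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, test


bandwidth_part, folds = split_dataset(range(17001))
rounds = list(train_test_rounds(folds))
```

With 17001 loans this gives 6800 loans for bandwidth selection and 20 folds of 510 or 511 loans each, in line with the approximate fold size quoted above.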

5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.

RIOM is the robust instance-based model in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i^2$) of each loan in the testing set. Thus, the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.

(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment on the optimal portfolio.

(3) Compare the return rate of the two models.

5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\, e^{-\zeta^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:

\[
h_{\mathrm{optimal}} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left( \mu_h(p_{-i}) - \mu_i \right)^2, \tag{16}
\]

where $\mu_h(p_{-i})$ is the leave-one-out estimation of the expected return rate $\mu_i$; specifically,

\[
\mu_h(p_{-i}) = \sum_{j=1,\, j \ne i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{l=1,\, l \ne i}^{n} K\left((p_i - p_l)/h\right)} \cdot R_j \right]. \tag{17}
\]

The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimal point, and the $h$ corresponding to the minimal point is the optimal bandwidth for the model.
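The leave-one-out search in (16) and (17) can be sketched in Python as a grid search over candidate bandwidths. Two assumptions are made here: the squared error is taken against the observed return of the held-out loan, which is one common reading of (16), and the grid 0.02, 0.04, ..., 0.20 mirrors the range on the horizontal axis of Figure 1. The data and the function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, r, h):
    """Leave-one-out estimate of Eq. (17): loan i is excluded from both sums."""
    num = den = 0.0
    for j in range(len(p)):
        if j == i:
            continue
        k = gaussian_kernel((p[i] - p[j]) / h)
        num += k * r[j]
        den += k
    return num / den


def cv_score(p, r, h):
    """CV(h): sum of squared leave-one-out errors (assumed target: r[i])."""
    return sum((loo_estimate(i, p, r, h) - r[i]) ** 2 for i in range(len(p)))


def select_bandwidth(p, r, grid):
    """Return the bandwidth in the grid with the smallest CV(h)."""
    return min(grid, key=lambda h: cv_score(p, r, h))


# Hypothetical toy sample: default probabilities and realized return rates.
p = [0.05, 0.08, 0.10, 0.15, 0.20, 0.30]
r = [0.07, 0.06, 0.08, 0.02, 0.04, -0.10]
grid = [0.02 * k for k in range(1, 11)]
h_opt = select_bandwidth(p, r, grid)
```

A grid search is used because CV(h) has no closed-form minimizer; on real data, a very small h can make every kernel value underflow to zero, so the lower end of the grid should be chosen with the spacing of the default probabilities in mind.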

To apply the robust credit portfolio optimization method and obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^{*} = 0.05$. We also set the risk-free return rate as 0.025, which is about equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.

Figure 2: Performance comparison. [Plot not reproduced; horizontal axis: the number of parameters set, 1 to 10; vertical axis: return rate of investment, from 0 to 0.09; series: IOM and RIOM.]

Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset    IOM      RIOM
1         0.0501   0.0566
2         0.0550   0.0633
3         0.0540   0.0618
4         0.0564   0.0696
5         0.0627   0.0714
6         0.0543   0.0629
7         0.0532   0.0688
8         0.0605   0.0711
9         0.0593   0.0706
10        0.0546   0.0664
11        0.0637   0.0701
12        0.0567   0.0640
13        0.0468   0.0569
14        0.0519   0.0663
15        0.0544   0.0620
16        0.0357   0.0472
17        0.0588   0.0710
18        0.0607   0.0774
19        0.0544   0.0655
20        0.0625   0.0808
Average   0.0553   0.0662
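The averages in Table 2 can be re-derived from the per-subset values. The short Python check below does this with the return rates expressed in basis points, an illustrative choice that keeps the inputs as integers. The variable names are hypothetical, and the data are copied from Table 2.

```python
# Table 2 return rates in basis points (1 bp = 0.0001), subsets 1 to 20.
iom = [501, 550, 540, 564, 627, 543, 532, 605, 593, 546,
       637, 567, 468, 519, 544, 357, 588, 607, 544, 625]
riom = [566, 633, 618, 696, 714, 629, 688, 711, 706, 664,
        701, 640, 569, 663, 620, 472, 710, 774, 655, 808]

mean_iom = sum(iom) / len(iom)       # average IOM return, in bp
mean_riom = sum(riom) / len(riom)    # average RIOM return, in bp

# Number of subsets in which RIOM earns more than IOM.
riom_wins = sum(1 for a, b in zip(iom, riom) if b > a)

# Relative improvement of the average return.
relative_gain = (mean_riom - mean_iom) / mean_iom
```

The two means are 552.85 and 661.85 basis points, which round to the 0.0553 and 0.0662 reported in the table. RIOM is higher in all 20 subsets, and the relative gain in average return is about 19.7%.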

In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^{*}$ and investment amount $M$.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10,000               5.0%
2     $10,000               5.5%
3     $10,000               6.0%
4     $15,000               5.0%
5     $15,000               5.5%
6     $15,000               6.0%
7     $20,000               5.0%
8     $20,000               5.5%
9     $20,000               6.0%

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios from the perspective of the actual return rate of investment. The more intuitive results are shown in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can find that, on average, the RIOM model outperforms the IOM model for every parameter set.

In conclusion, the optimal portfolio identified from the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when the historical observation of the same borrower is unavailable. Finally, this model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so as to ensure that the optimal allocation strategy is efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.

Table 4: Investment performances of input parameters for portfolio selection.

M = $10,000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0598   0.0727    0.0601   0.0762    0.0502   0.0722
2         0.0500   0.0592    0.0601   0.0779    0.0675   0.0851
3         0.0441   0.0571    0.0491   0.0698    0.0735   0.0934
4         0.0525   0.0602    0.0658   0.0903    0.0636   0.0754
5         0.0532   0.0620    0.0631   0.0829    0.0513   0.0731
6         0.0634   0.0747    0.0564   0.0762    0.0717   0.1105
7         0.0613   0.0736    0.0547   0.0754    0.0551   0.0884
8         0.0529   0.0594    0.0505   0.0616    0.0685   0.0858
9         0.0548   0.0647    0.0550   0.0736    0.0559   0.0493
10        0.0474   0.0574    0.0472   0.0634    0.0499   0.0794
11        0.0597   0.0730    0.0602   0.0795    0.0661   0.1090
12        0.0644   0.0768    0.0541   0.0673    0.0624   0.1042
13        0.0635   0.0785    0.0709   0.0895    0.0532   0.0662
14        0.0593   0.0744    0.0626   0.0751    0.0634   0.1204
15        0.0523   0.0636    0.0485   0.0609    0.0571   0.0987
16        0.0549   0.0705    0.0684   0.0893    0.0508   0.1264
17        0.0549   0.0666    0.0549   0.0757    0.0538   0.0677
18        0.0546   0.0629    0.0512   0.0615    0.0560   0.0610
19        0.0492   0.0555    0.0572   0.0685    0.0657   0.0436
20        0.0554   0.0645    0.0413   0.0504    0.0596   0.0366
Average   0.0554   0.0664    0.0566   0.0732    0.0598   0.0823

M = $15,000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0501   0.0566    0.0558   0.0839    0.0520   0.0833
2         0.0550   0.0633    0.0517   0.0620    0.0551   0.0648
3         0.0540   0.0618    0.0598   0.0778    0.0631   0.0646
4         0.0564   0.0696    0.0512   0.0629    0.0553   0.0847
5         0.0627   0.0714    0.0566   0.0721    0.0616   0.0896
6         0.0543   0.0629    0.0570   0.0772    0.0585   0.0874
7         0.0532   0.0688    0.0528   0.0727    0.0620   0.0566
8         0.0605   0.0711    0.0545   0.0768    0.0645   0.0855
9         0.0593   0.0706    0.0507   0.0576    0.0574   0.1214
10        0.0546   0.0664    0.0528   0.0634    0.0622   0.0631
11        0.0637   0.0701    0.0562   0.0756    0.0498   0.0662
12        0.0567   0.0640    0.0529   0.0677    0.0574   0.0983
13        0.0468   0.0569    0.0637   0.0880    0.0504   0.0913
14        0.0519   0.0663    0.0568   0.0716    0.0614   0.1162
15        0.0544   0.0620    0.0577   0.0764    0.0633   0.0802
16        0.0357   0.0472    0.0642   0.0851    0.0573   0.0549
17        0.0588   0.0710    0.0674   0.0867    0.0615   0.0496
18        0.0607   0.0774    0.0585   0.0732    0.0687   0.0824
19        0.0544   0.0655    0.0434   0.0633    0.0589   0.0759
20        0.0625   0.0808    0.0562   0.0687    0.0698   0.0978
Average   0.0553   0.0662    0.0560   0.0731    0.0595   0.0807

M = $20,000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0594   0.0774    0.0544   0.0689    0.0691   0.0649
2         0.0504   0.0584    0.0664   0.0895    0.0661   0.0769
3         0.0503   0.0568    0.0554   0.0737    0.0647   0.0869
4         0.0566   0.0648    0.0617   0.0856    0.0518   0.0635
5         0.0576   0.0709    0.0547   0.0692    0.0610   0.0771
6         0.0584   0.0683    0.0528   0.0701    0.0516   0.0385
7         0.0481   0.0549    0.0485   0.0631    0.0460   0.0759
8         0.0545   0.0701    0.0628   0.0823    0.0592   0.0845
9         0.0535   0.0586    0.0561   0.0764    0.0574   0.1038
10        0.0514   0.0597    0.0582   0.0683    0.0689   0.0532
11        0.0531   0.0584    0.0569   0.0657    0.0572   0.1141
12        0.0551   0.0678    0.0536   0.0734    0.0618   0.0687
13        0.0555   0.0695    0.0636   0.0824    0.0616   0.1157
14        0.0577   0.0674    0.0541   0.0636    0.0572   0.0818
15        0.0597   0.0775    0.0536   0.0706    0.0595   0.0704
16        0.0593   0.0698    0.0616   0.0807    0.0551   0.0748
17        0.0535   0.0641    0.0487   0.0636    0.0696   0.0915
18        0.0599   0.0729    0.0576   0.0746    0.0507   0.1069
19        0.0581   0.0673    0.0472   0.0638    0.0623   0.1148
20        0.0518   0.0709    0.0601   0.0734    0.0638   0.0744
Average   0.0552   0.0663    0.0564   0.0729    0.0597   0.0819

Data Availability

The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P Lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.

[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

Mathematical Problems in Engineering 5

where 120583 V 120583119876 and 119881119876 carry the same meaning as in (8) and(9) tr(V) |119881| and V be the trace the determinant and thetranspose of V respectively n is the amount of assets in theportfolio

Let U denote the set of parameters (120583119876 119881119876) under themeasure Q in Q Using the constraint of relative entropy wecan rewrite the robust optimization model (9) as

(RMV-RE) min120582

max(120583119876119881119876)isinU

120582T119881119876120582st min

(120583119876119881119876)isinU120582T120583119876 ge 119877lowast

119863119870119871 (119876 119875) le 119870120582 isin Ω

(12)

where K is a positive constant and determines the size ofuncertainty set Parameter K measures the level of uncer-tainty and reflects the investorsrsquo confidence in 120583 and Vestimated under probability measure P ie the greater Krsquosvalue the less confidence

Yam et al [6] prove that the robustmean-variance portfo-lio selection model based on relative entropy method (RMV-RE) can be formulated as quadratic optimization problemwhich is a tractable formulation and can be efficiently solvedThat is

min120582isinR119899

120582T119881 lowast 120582st 120582T120583lowast ge 119877lowast

120582 isin Ω(13)

Herein 120583lowast=120577120583 Vlowast=V+120577(1-120577)120583120583T and 120577 isin (0 1] is relatedto K in (12) closely which reflects the level of confidencein 120583 and V estimated under measure P For example 120577=1means that investors believe the estimated 120583 and V are thetrue parameters And as 120577 decreases the investorrsquos confidenceis weaker The details of the proof are referred to by Yam et al[6]

42 Robust Mean-Variance Portfolio Optimization Model inP2P Lending In the Section 32 we estimated each loanrsquosexpected return and variance of return ie 120583119894 and 120590119894 usingthe instance-based credit risk assessment model Let 120583 =(1205831 1205832 120583119899)T and

=[[[[[[[[[

1 0 00 2 d

d d 00 0 120590119899

]]]]]]]]]

(14)

denote the expected return vector and the covariance matrixof the loansrsquo returns under the probability measure P Herewe assume that the correlation between P2P loans is negligi-ble Now we can rewrite (13) as

Table 1 Description of variables

Variable DescriptionX1 FICO score of the borrower

X2The number of inquiries of the borrower in the last 6

monthsX3 Themonetary amount of the loan

X4The homeownership status of the borrower (0 = rent 1

= own)X5 The debt-to-income ratio of the borrowerX6 The number of accounts delinquentX7 The number of public records in the past 10 yearsY Dependent variable (0 = completed 1 = default)

min120582isinR119899

120582T ( + 120577 (1 minus 120577) 120583120583119879) 120582st 120582T (120577120583) ge 119877lowast

120582 isin Ω(15)

The feasible region Ω of our problem is defined by thefollowing constraints

(1) The value of the portfolio remains at its initial valueiesum119894 120582119894 = 1

(2) Short-selling is forbidden thus 120582119894 ge 0(3) For each loan the amount that lender can invest is

no more than the borrower request mi thereby 120582119894Mle mi where M is the total investment amount andinvestor has available

5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using the real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocess. The dataset for the empirical study is from a notable P2P lending platform in the United States, Prosper. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1-7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.


Figure 1: The curve of CV(h) (horizontal axis: bandwidth h; vertical axis: CV(h)).

There exist many credit scoring models for predicting the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.

We randomly divide the dataset into two parts: one containing 40% of all loans, used for determining the optimal bandwidth h in (5), which will be described in detail in Section 5.3, and a second part containing 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (thus their pay-back statuses are unknown), and all other subsets are taken as a training set, which consists of historical loans with known yield.
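The partitioning described above can be sketched as follows. This is a minimal illustration that works on loan indices only; the function names, the fixed random seed, and the round-robin assignment of loans to folds are assumptions made for the example and are not stated in the text.

```python
import random


def split_dataset(n_loans, bandwidth_share=0.4, n_folds=20, seed=0):
    """Shuffle loan indices, reserve a share for bandwidth selection, fold the rest."""
    idx = list(range(n_loans))
    random.Random(seed).shuffle(idx)
    cut = int(round(bandwidth_share * n_loans))
    bandwidth_part, rest = idx[:cut], idx[cut:]
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def rounds(folds):
    """Yield (training indices, testing indices) for each cross-validation round."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


bandwidth_part, folds = split_dataset(17001)
fold_sizes = [len(f) for f in folds]
first_train, first_test = next(rounds(folds))
```

With 17001 loans, the second part holds about 60% of the data, so each of the 20 folds ends up with 510 or 511 loans, in line with the "approximately 510" quoted in the text.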

5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.

RIOM is the robust instance-based model in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.

(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment on the optimal portfolio.

(3) Compare the return rates of the two models.

5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\,e^{-\zeta^{2}/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth h, is optimized by the following leave-one-out cross-validation:

$$h_{optimal} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_{h}(p_{-i}) - \mu_{i}\right)^{2}, \tag{16}$$

where $\mu_{h}(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_{h}(p_{-i}) = \sum_{j=1,\, j \ne i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1,\, j \ne i}^{n} K\left((p_i - p_j)/h\right)} \cdot R_j \right]. \tag{17}$$

The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimal point, and the h corresponding to the minimal point is the optimal bandwidth for the model.
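The bandwidth search in (16) and (17) can be sketched as a simple grid search. The sketch makes two assumptions that the text does not spell out: the realized return $R_i$ of the left-out loan stands in for $\mu_i$ in the squared error, and the candidate bandwidths are taken from 0.02 to 0.20 in steps of 0.02, following the axis of Figure 1. The sample values of `p` and `R` are hypothetical.

```python
import math


def gaussian_kernel(z):
    """Standard Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2*pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, R, h):
    """Kernel-weighted average of the returns of all loans except loan i, as in (17)."""
    others = [j for j in range(len(p)) if j != i]
    weights = [gaussian_kernel((p[i] - p[j]) / h) for j in others]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed for this h: fall back to a plain mean
        return sum(R[j] for j in others) / len(others)
    return sum(w * R[j] for w, j in zip(weights, others)) / total


def cv_score(p, R, h):
    """Sum of squared leave-one-out errors, CV(h) in (16) with R[i] in place of mu_i."""
    return sum((loo_estimate(i, p, R, h) - R[i]) ** 2 for i in range(len(p)))


def best_bandwidth(p, R, grid):
    """Return the grid value with the smallest CV score."""
    return min(grid, key=lambda h: cv_score(p, R, h))


# Hypothetical sample: predicted default probabilities and realized returns.
p = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
R = [0.08, 0.07, 0.07, 0.05, 0.04, 0.02]
grid = [0.02 * k for k in range(1, 11)]

h_opt = best_bandwidth(p, R, grid)
```

A grid search is only one way to locate the minimum of CV(h); on the full dataset the double loop is quadratic in the number of loans, so a vectorized implementation would be preferable in practice.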

To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount M = 15 thousand dollars, and the required rate of return $R^{*} = 0.05$. We also set the risk-free return rate as 0.025, which is approximately equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
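The text does not state how the Sharpe ratio is computed. One common reading, assumed here purely for illustration, is the mean excess return over the risk-free rate divided by the sample standard deviation across the 20 test subsets. The sketch below applies that reading to the return rates reported in Table 2, with the risk-free rate of 0.025 given above; the function name `sharpe_ratio` is hypothetical.

```python
import statistics

RISK_FREE = 0.025  # risk-free return rate stated in the text

# Return rates per test subset, as reported in Table 2.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]


def sharpe_ratio(returns, risk_free=RISK_FREE):
    """Mean excess return divided by the sample standard deviation (assumed definition)."""
    return (statistics.mean(returns) - risk_free) / statistics.stdev(returns)


sharpe_iom = sharpe_ratio(iom)
sharpe_riom = sharpe_ratio(riom)
```

Under this assumed definition, the ratio rewards a model both for a higher average return and for less variation across subsets, so it complements the subset-by-subset comparison in Table 2.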


Figure 2: Performance comparison (horizontal axis: the number of the parameter set; vertical axis: return rate of investment; series: IOM and RIOM).

Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset   IOM     RIOM
1        0.0501  0.0566
2        0.0550  0.0633
3        0.0540  0.0618
4        0.0564  0.0696
5        0.0627  0.0714
6        0.0543  0.0629
7        0.0532  0.0688
8        0.0605  0.0711
9        0.0593  0.0706
10       0.0546  0.0664
11       0.0637  0.0701
12       0.0567  0.0640
13       0.0468  0.0569
14       0.0519  0.0663
15       0.0544  0.0620
16       0.0357  0.0472
17       0.0588  0.0710
18       0.0607  0.0774
19       0.0544  0.0655
20       0.0625  0.0808
Average  0.0553  0.0662

In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^{*}$ and investment amount M.

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The more intuitive results are shown in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can find that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

Set  Investment amount M  Required rate R*
1    $10000               5.0%
2    $10000               5.5%
3    $10000               6.0%
4    $15000               5.0%
5    $15000               5.5%
6    $15000               6.0%
7    $20000               5.0%
8    $20000               5.5%
9    $20000               6.0%

In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, this model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.


Table 4: Investment performances of input parameters for portfolio selection.

M = 10000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0598  0.0727   0.0601  0.0762   0.0502  0.0722
2        0.0500  0.0592   0.0601  0.0779   0.0675  0.0851
3        0.0441  0.0571   0.0491  0.0698   0.0735  0.0934
4        0.0525  0.0602   0.0658  0.0903   0.0636  0.0754
5        0.0532  0.0620   0.0631  0.0829   0.0513  0.0731
6        0.0634  0.0747   0.0564  0.0762   0.0717  0.1105
7        0.0613  0.0736   0.0547  0.0754   0.0551  0.0884
8        0.0529  0.0594   0.0505  0.0616   0.0685  0.0858
9        0.0548  0.0647   0.0550  0.0736   0.0559  0.0493
10       0.0474  0.0574   0.0472  0.0634   0.0499  0.0794
11       0.0597  0.0730   0.0602  0.0795   0.0661  0.1090
12       0.0644  0.0768   0.0541  0.0673   0.0624  0.1042
13       0.0635  0.0785   0.0709  0.0895   0.0532  0.0662
14       0.0593  0.0744   0.0626  0.0751   0.0634  0.1204
15       0.0523  0.0636   0.0485  0.0609   0.0571  0.0987
16       0.0549  0.0705   0.0684  0.0893   0.0508  0.1264
17       0.0549  0.0666   0.0549  0.0757   0.0538  0.0677
18       0.0546  0.0629   0.0512  0.0615   0.0560  0.0610
19       0.0492  0.0555   0.0572  0.0685   0.0657  0.0436
20       0.0554  0.0645   0.0413  0.0504   0.0596  0.0366
Average  0.0554  0.0664   0.0566  0.0732   0.0598  0.0823

M = 15000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0501  0.0566   0.0558  0.0839   0.0520  0.0833
2        0.0550  0.0633   0.0517  0.0620   0.0551  0.0648
3        0.0540  0.0618   0.0598  0.0778   0.0631  0.0646
4        0.0564  0.0696   0.0512  0.0629   0.0553  0.0847
5        0.0627  0.0714   0.0566  0.0721   0.0616  0.0896
6        0.0543  0.0629   0.0570  0.0772   0.0585  0.0874
7        0.0532  0.0688   0.0528  0.0727   0.0620  0.0566
8        0.0605  0.0711   0.0545  0.0768   0.0645  0.0855
9        0.0593  0.0706   0.0507  0.0576   0.0574  0.1214
10       0.0546  0.0664   0.0528  0.0634   0.0622  0.0631
11       0.0637  0.0701   0.0562  0.0756   0.0498  0.0662
12       0.0567  0.0640   0.0529  0.0677   0.0574  0.0983
13       0.0468  0.0569   0.0637  0.0880   0.0504  0.0913
14       0.0519  0.0663   0.0568  0.0716   0.0614  0.1162
15       0.0544  0.0620   0.0577  0.0764   0.0633  0.0802
16       0.0357  0.0472   0.0642  0.0851   0.0573  0.0549
17       0.0588  0.0710   0.0674  0.0867   0.0615  0.0496
18       0.0607  0.0774   0.0585  0.0732   0.0687  0.0824
19       0.0544  0.0655   0.0434  0.0633   0.0589  0.0759
20       0.0625  0.0808   0.0562  0.0687   0.0698  0.0978
Average  0.0553  0.0662   0.0560  0.0731   0.0595  0.0807

M = 20000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0594  0.0774   0.0544  0.0689   0.0691  0.0649
2        0.0504  0.0584   0.0664  0.0895   0.0661  0.0769
3        0.0503  0.0568   0.0554  0.0737   0.0647  0.0869
4        0.0566  0.0648   0.0617  0.0856   0.0518  0.0635
5        0.0576  0.0709   0.0547  0.0692   0.0610  0.0771
6        0.0584  0.0683   0.0528  0.0701   0.0516  0.0385
7        0.0481  0.0549   0.0485  0.0631   0.0460  0.0759
8        0.0545  0.0701   0.0628  0.0823   0.0592  0.0845
9        0.0535  0.0586   0.0561  0.0764   0.0574  0.1038
10       0.0514  0.0597   0.0582  0.0683   0.0689  0.0532
11       0.0531  0.0584   0.0569  0.0657   0.0572  0.1141
12       0.0551  0.0678   0.0536  0.0734   0.0618  0.0687
13       0.0555  0.0695   0.0636  0.0824   0.0616  0.1157
14       0.0577  0.0674   0.0541  0.0636   0.0572  0.0818
15       0.0597  0.0775   0.0536  0.0706   0.0595  0.0704
16       0.0593  0.0698   0.0616  0.0807   0.0551  0.0748
17       0.0535  0.0641   0.0487  0.0636   0.0696  0.0915
18       0.0599  0.0729   0.0576  0.0746   0.0507  0.1069
19       0.0581  0.0673   0.0472  0.0638   0.0623  0.1148
20       0.0518  0.0709   0.0601  0.0734   0.0638  0.0744
Average  0.0552  0.0663   0.0564  0.0729   0.0597  0.0819


Data Availability

The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, “Algorithms for optimization of Value-at-Risk,” in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for general loss distributions,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, “Instance-based credit risk assessment for investment decisions in P2P Lending,” European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, “Optimal asset allocation: Risk and information uncertainty,” European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, “Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending,” Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, “Search and herding effects in peer-to-peer lending: evidence from prosper.com,” Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy,” The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, “A nonparametric approach to credit screening,” Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wigintor, “A note on the comparison of logit and discriminant models of consumer credit behavior,” Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, “Modelling the profitability of credit cards by Markov decision processes,” European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, “Modelling profitability using survival combination scores,” European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, “Neural network credit scoring models,” Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, “Two-stage genetic programming (2SGP) for the credit scoring model,” Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, “Credit scoring with a data mining approach based on support vector machines,” Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, “Selection of support vector machines based classifiers for credit risk domain,” Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, “Modelling market implied ratings using LASSO variable selection techniques,” Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, “Constructing risk measures from uncertainty sets,” Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, “Tight bounds for some risk measures, with applications to robust portfolio selection,” Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, “A paradox for the ‘smooth ambiguity’ model of preference,” Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, “Tractable robust expected utility and risk models for portfolio optimization,” Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, “Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity,” TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, “Robust control and model uncertainty,” The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, “Ambiguous risk measures and optimal robust portfolios,” Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, “Data-driven robust optimization,” Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, “Data-driven robust mean-CVaR portfolio selection under distribution ambiguity,” Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, “Nonparametric estimation and sensitivity analysis of expected shortfall,” Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, “Mean–CVaR portfolio selection: A nonparametric estimation framework,” Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, “On non-parametric estimates of density functions and regression,” Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.

[33] T. Chen and T. He, “Higgs boson discovery with boosted trees,” in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, “A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring,” Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, “A novel ensemble method for credit scoring: Adaption of different imbalance ratios,” Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, “A hybrid KMV model, random forests and rough set theory approach for credit rating,” Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, “Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment,” Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, “Genetic algorithms for credit scoring: Alternative fitness function performance comparison,” Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.


6 Mathematical Problems in Engineering

009500955

009600965

009700975

009800985

009900995

01

CV (h

)002 004 006 008 01 012 014 016 018 020

h

Figure 1 The curve of CV (h)

There exist many credit scoring models to predict thedefault probability of a loan such as Xgboost model [33ndash35] hybrid KMVmodel [36] credit scoring based on geneticalgorithms [37 38] and so on However discussing how tochoose and construct the optimal credit scoring model isbeyond the scope of this study and we use the most popularmodel logistic regression to make the prediction in thispreprocessing step

We randomly divide the dataset into two parts onecontaining 40 of all loans for determining the optimalbandwidth h in (5) which will be described in detail inSection 53 and the second part containing 60 of the loansMoreover using k-fold cross-validation we randomly dividethe second part into 20 subsets each of which containsapproximately 510 loans In each round one of the subsetsis used as the testing set which consists of loans waiting tobe invested thus their pay-back statuses are unknown andall other subsets are taken as a training set which consists ofhistorical loans with known yield

52 Model Description In this paper we propose a robustcredit portfolio optimization model for investment decisionsin P2P lending In order to show its effectiveness we compareit with a benchmark model proposed by Guo et al [5] In thefollowing we describe models in detail

IOM is the instance-based model proposed by Guo etal [5] Each loan is assessed using kernel weights and thehistorical performance of similar loans Then use the classicalmean-variance model (8) to identify the optimal allocationstrategy The performance of this model outperforms somerating-based models as the results of Guo et al [5] show

RIOM is the robust instance-based model in this studyExpected return and risk of each loan are also assessed basedon the ldquoinstance-basedrdquo assessment framework However weuse the robust model of credit portfolio optimization basedon relative entropy method Equation (15) to obtain theoptimal investment decision

We compare the two models by the following procedure(1) Train the credit risk assessment model with the

training set and use the trained model to predict theexpected return (120583119894) and variance (120590119894) of each loan inthe testing set Thus the expected return vector andthe covariance matrix 120583 and V can be obtained

(2) For each model feed the predicted expected returnvector 120583 and the covariance matrix 119881 of the testingloans into the portfolio optimization algorithm andcompute the performance of investment on the opti-mal portfolio

(3) Compare the return rate of the two models

53 Analysis of Results As mentioned before we select theGaussian kernel 119870(120577) = (1radic2120587)119890minus12057722 as the kernel func-tion And the important parameter in the kernel regressionmodel bandwidth h is optimized by the following leave-one-out cross validation

ℎ119900119901119905119894119898119886119897 = argminℎ119862119881 (ℎ)

= argminℎ

119899sum119894=1

(120583ℎ (119901minus119894) minus 120583119894)2 (16)

where 120583ℎ(119901minus119894) is the leave-one-out estimation of expectedreturn rate 120583119894 specifically

120583ℎ (119901minus119894) =119899sum119895=1119895 =119894

[[

119870((119901119894 minus 119901119895) ℎ)sum119899119895=1119895 =119894119870((119901119894 minus 119901119895) ℎ) sdot 119877119895

]] (17)

The curve of CV(h) is exhibited in Figure 1 The shape of thecurve clearly shows a minimal point and h corresponding tothe minimal point is the optimal bandwith for the model

To apply the robust credit portfolio optimization methodto obtain the optimal investment strategy in problems (13)we select the parameter 120577=075 the investment amount M =15 thousand dollars and the required rate of return 119877lowast = 005We also set the risk-free return rate as 0025 which is aboutequivalent to the average yield of T-Bills over the sameperiodAnd we use the MATLAB built-in solver ldquoquaprogrdquo to solvethe two portfolio optimization problems

Table 2 summarizes investment return rate of each testsubset and the average performance of the Prosper dataset Itshows that the two portfolios are almost always efficient andfeasible except subset 16The results also show that the actualperformances of the optimal portfolio derived from RIOMalways outperform the optimal portfolio from IOM Andthe Sharpe ratio shows that median-based optimal portfolioperforms better as well

Mathematical Problems in Engineering 7

1 1098765432The number of parameters set

IOMRIOM

0001002003004005006007008009

Retu

rn ra

te o

f inv

estm

ent

Figure 2 Performance comparison

Table 2 Rate of return from the optimal portfolio on the Prosperdataset

Subset IOM RIOM1 00501 005662 00550 006333 00540 006184 00564 006965 00627 007146 00543 006297 00532 006888 00605 007119 00593 0070610 00546 0066411 00637 0070112 00567 0064013 00468 0056914 00519 0066315 00544 0062016 00357 0047217 00588 0071018 00607 0077419 00544 0065520 00625 00808Average 00553 00662

In order to test and verify that the conclusions obtainedfrom the above experiments are stable we consider dif-ferent investment amounts and required returns as inputparameters for portfolio selection and keep other conditionsunchanged As summarized in Table 3 we consider nineparameters pairs about required return rate 119877lowast and invest-ment amount M

The computational results for each parameters pair aresummarized in Table 4 Table 4 shows performance compar-ison of the two optimal portfolios from the perspectives ofactual return rate of investment The more intuitive resultsare shown in Figure 2 which shows the actual return ratecomparison of the two models The first 9 numbers ofthe horizontal axis in Figure 2 represent the correspondingparameters combinations (sets 1 through 9 fromTable 3) and

Table 3 Investorsrsquo choices of input parameters for portfolio selec-tion

Set Investment amountM Required rate 119877lowast1 $10000 502 $10000 553 $10000 604 $15000 505 $15000 556 $15000 607 $20000 508 $20000 559 $20000 60

the number 10 shows the average We can find that the RIOMmodel outperforms the IOMmodel comprehensively

In conclusion the optimal portfolio identified from therobust optimization model in this study is more efficient thanthe existing model And the performance of our model ismore robust and stable

6 Conclusions

In this paper we formulate a data-driven robust modelof portfolio optimization with relative entropy constraintsbased on an instance-based credit risk assessment frameworkfor investment decisions in P2P lending This P2P lendinginvestment decision model has at least three advantagesFirstly it provides a more refined measure of P2P loansrsquo riskand reveals a more intuitive and quantized risk estimate toinvestors instead of just labelling each loan with a creditgrade Secondly this model can estimate each loanrsquos expectedreturn and risk when the historical observation of the sameborrower is unavailable Finally this model considers theloansrsquo distribution ambiguity (probability measure uncer-tainty) problem and uses relative entropy tomodel parameteruncertainty to ensure the optimal allocation strategy effi-cient and feasible under various actual scenarios Numericalexperiments imply that the P2P lending investment decisionmodel using the robust optimization with relative entropyconstraints provides better performance than existing model

8 Mathematical Problems in Engineering

Table4Investm

entp

erform

anceso

finp

utparametersfor

portfolio

selection

Subset

119877lowast=5

119877lowast

=55

119877lowast

=6

119877lowast=5

119877lowast

=55

119877lowast

=6

119877lowast

=5

119877lowast =

55

119877lowast =

6

M=10000

M=10000

M=10000

M=15000

M=15000

M=15000

M=20000

M=20000

M=20000

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

100598

007

27006

0100762

00502

007

2200501

005

6600558

008

3900520

008

3300594

007

7400544

006

8900691

006

492

00500

005

92006

01007

79006

75008

5100550

006

3300517

006

2000551

006

4800504

005

84006

64008

95006

6100769

3004

41005

7100491

006

9800735

009

3400540

006

1800598

007

7800631

006

4600503

005

6800554

007

37006

47008

694

00525

006

0200658

009

0300636

007

5400564

006

9600512

006

2900553

008

4700566

006

4800617

008

5600518

006

355

00532

006

2000631

008

2900513

007

3100627

007

1400566

007

2100616

008

9600576

007

0900547

006

9200610

007

716

00634

00747

00564

00762

00717

01105

00543

006

2900570

007

7200585

008

7400584

00 6

8 300528

007

0100516

003

857

00613

007

3600547

007

5400551

008

8400532

006

8800528

007

2700620

005

66004

81005

49004

85006

31004

60007

598

00529

005

9400505

006

16006

85008

58006

05007

1100545

00768

006

45008

5500545

007

0100628

008

2300592

008

459

00548

006

4700550

007

3600559

004

9300593

007

0600507

005

7600574

01214

00535

005

8600561

00764

00574

01038

10004

74005

74004

72006

3400499

007

9400546

006

6400528

006

3400622

006

3100514

005

9700582

006

83006

8900532

1100597

007

30006

02007

95006

6101090

00637

007

0100562

007

5600498

006

6200531

005

8400569

006

5700572

01141

12006

4400768

00541

006

7300624

01 042

00567

006

4000529

006

7700574

009

8300551

006

7800536

007

3400618

006

8713

00635

007

8500709

008

9500532

006

62004

68005

6900637

008

8000504

009

1300555

006

9500636

008

2400616

01157

1400593

00744

00626

007

5100634

01204

00519

006

6300568

007

1600614

01162

00577

006

7400541

006

3600572

008

1815

00523

006

36004

85006

0900571

009

8700544

006

2000577

00764

00633

008

0200597

007

7500536

007

0600595

007

0416

00549

007

05006

84008

9300508

01264

00357

004

72006

42008

5100573

005

4900593

006

9800616

008

0700551

00748

1700549

006

6600549

007

5700538

006

7700588

007

10006

74008

6700615

004

9600535

006

41004

87006

3600 69 6

009

1518

00546

006

2900512

006

1500560

006

10006

07007

7400585

007

32006

87008

2400599

007

2900576

00746

00507

01069

1900492

005

5500572

006

8500657

004

3600544

006

5500434

006

3300589

007

5900581

006

73004

72006

3800623

01148

2000554

006

45004

13005

0400596

003

6600625

008

0800562

006

8700698

009

7800518

007

09006

01007

3400638

00744

Average

00554

006

6400566

007

3200598

008

2300553

006

6200560

007

3100595

008

0700552

006

6300564

007

2900597

008

19

Mathematical Problems in Engineering 9

Figure 2: Performance comparison. [Plot of the return rate of investment (vertical axis, 0 to 0.09) against the number of the parameter set (horizontal axis, 1 to 10) for the IOM and RIOM models.]

Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset    IOM      RIOM
1         0.0501   0.0566
2         0.0550   0.0633
3         0.0540   0.0618
4         0.0564   0.0696
5         0.0627   0.0714
6         0.0543   0.0629
7         0.0532   0.0688
8         0.0605   0.0711
9         0.0593   0.0706
10        0.0546   0.0664
11        0.0637   0.0701
12        0.0567   0.0640
13        0.0468   0.0569
14        0.0519   0.0663
15        0.0544   0.0620
16        0.0357   0.0472
17        0.0588   0.0710
18        0.0607   0.0774
19        0.0544   0.0655
20        0.0625   0.0808
Average   0.0553   0.0662
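The Average row of Table 2 can be checked directly from the 20 subset entries. The following is a minimal sketch, assuming each entry is read as a decimal fraction (for example 0.0501); the names `TABLE2` and `summarize` are illustrative and do not come from the paper.

```python
# Table 2 entries read as decimal fractions: (subset, IOM return, RIOM return).
TABLE2 = [
    (1, 0.0501, 0.0566), (2, 0.0550, 0.0633), (3, 0.0540, 0.0618),
    (4, 0.0564, 0.0696), (5, 0.0627, 0.0714), (6, 0.0543, 0.0629),
    (7, 0.0532, 0.0688), (8, 0.0605, 0.0711), (9, 0.0593, 0.0706),
    (10, 0.0546, 0.0664), (11, 0.0637, 0.0701), (12, 0.0567, 0.0640),
    (13, 0.0468, 0.0569), (14, 0.0519, 0.0663), (15, 0.0544, 0.0620),
    (16, 0.0357, 0.0472), (17, 0.0588, 0.0710), (18, 0.0607, 0.0774),
    (19, 0.0544, 0.0655), (20, 0.0625, 0.0808),
]


def summarize(rows):
    """Return (mean IOM, mean RIOM, number of subsets where RIOM is higher)."""
    n = len(rows)
    mean_iom = sum(iom for _, iom, _ in rows) / n
    mean_riom = sum(riom for _, _, riom in rows) / n
    wins = sum(1 for _, iom, riom in rows if riom > iom)
    return mean_iom, mean_riom, wins


mean_iom, mean_riom, wins = summarize(TABLE2)
print(f"mean IOM  = {mean_iom:.4f}")
print(f"mean RIOM = {mean_riom:.4f}")
print(f"RIOM higher in {wins} of {len(TABLE2)} subsets")
```

The computed means agree with the Average row after rounding to four decimals, and the RIOM return is the higher of the two in each subset of this table.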

In order to test and verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate R* and investment amount M.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10000                5.0%
2     $10000                5.5%
3     $10000                6.0%
4     $15000                5.0%
5     $15000                5.5%
6     $15000                6.0%
7     $20000                5.0%
8     $20000                5.5%
9     $20000                6.0%

The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual rate of return on investment. Figure 2 gives a more intuitive view of the same comparison between the two models. The first nine positions on the horizontal axis of Figure 2 correspond to the parameter combinations (sets 1 through 9 from Table 3), and position 10 shows the average. We find that the RIOM model outperforms the IOM model across all parameter sets.
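The comparison across the nine parameter sets can be tabulated from the Average row of Table 4 together with the settings in Table 3. This is a minimal sketch, assuming the averages are read as decimal fractions; `AVERAGES` and `improvement_by_set` are illustrative names, not part of the original study.

```python
# (set, investment amount M in dollars, required rate R* in percent,
#  average IOM return, average RIOM return), from Table 3 and the
# Average row of Table 4.
AVERAGES = [
    (1, 10000, 5.0, 0.0554, 0.0664),
    (2, 10000, 5.5, 0.0566, 0.0732),
    (3, 10000, 6.0, 0.0598, 0.0823),
    (4, 15000, 5.0, 0.0553, 0.0662),
    (5, 15000, 5.5, 0.0560, 0.0731),
    (6, 15000, 6.0, 0.0595, 0.0807),
    (7, 20000, 5.0, 0.0552, 0.0663),
    (8, 20000, 5.5, 0.0564, 0.0729),
    (9, 20000, 6.0, 0.0597, 0.0819),
]


def improvement_by_set(rows):
    """Return a list of (set number, average RIOM return minus average IOM return)."""
    return [(set_id, riom - iom) for set_id, _, _, iom, riom in rows]


gaps = improvement_by_set(AVERAGES)
for (set_id, gap), (_, amount, rate, _, _) in zip(gaps, AVERAGES):
    print(f"set {set_id}: M = ${amount}, R* = {rate}%, RIOM - IOM = {gap:.4f}")
```

Every gap is positive on these averages, which is the pattern the text attributes to Figure 2.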

In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, the model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, the model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments indicate that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.
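To make the role of relative entropy concrete, the sketch below computes the relative entropy (Kullback-Leibler divergence) between two discrete distributions and tests whether a candidate distribution lies within a given radius of a nominal one. It is an illustration only: the discrete setting, the direction of the divergence, the radius `eta`, and the function names are assumptions made here and are not taken from the model formulation.

```python
import math


def relative_entropy(q, p):
    """Relative entropy D(q || p) = sum_i q_i * log(q_i / p_i), discrete case."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # the term 0 * log(0 / p_i) is taken as 0
        if pi == 0.0:
            return math.inf  # q puts mass where p has none
        total += qi * math.log(qi / pi)
    return total


def in_ambiguity_set(q, p, eta):
    """True if q lies within relative-entropy radius eta of the nominal p."""
    return relative_entropy(q, p) <= eta


# Hypothetical four-scenario example: a uniform nominal distribution and a
# candidate that shifts probability mass toward the first scenarios.
nominal = [0.25, 0.25, 0.25, 0.25]
tilted = [0.40, 0.30, 0.20, 0.10]

print(relative_entropy(nominal, nominal))
print(relative_entropy(tilted, nominal))
print(in_ambiguity_set(tilted, nominal, eta=0.2))
```

A larger radius admits distributions further from the nominal estimate, so a robust portfolio chosen against the whole set is protected against more estimation error at the cost of a more conservative allocation.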

Table 4: Investment performances of input parameters for portfolio selection.

M = 10000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0598   0.0727    0.0601   0.0762    0.0502   0.0722
2         0.0500   0.0592    0.0601   0.0779    0.0675   0.0851
3         0.0441   0.0571    0.0491   0.0698    0.0735   0.0934
4         0.0525   0.0602    0.0658   0.0903    0.0636   0.0754
5         0.0532   0.0620    0.0631   0.0829    0.0513   0.0731
6         0.0634   0.0747    0.0564   0.0762    0.0717   0.1105
7         0.0613   0.0736    0.0547   0.0754    0.0551   0.0884
8         0.0529   0.0594    0.0505   0.0616    0.0685   0.0858
9         0.0548   0.0647    0.0550   0.0736    0.0559   0.0493
10        0.0474   0.0574    0.0472   0.0634    0.0499   0.0794
11        0.0597   0.0730    0.0602   0.0795    0.0661   0.1090
12        0.0644   0.0768    0.0541   0.0673    0.0624   0.1042
13        0.0635   0.0785    0.0709   0.0895    0.0532   0.0662
14        0.0593   0.0744    0.0626   0.0751    0.0634   0.1204
15        0.0523   0.0636    0.0485   0.0609    0.0571   0.0987
16        0.0549   0.0705    0.0684   0.0893    0.0508   0.1264
17        0.0549   0.0666    0.0549   0.0757    0.0538   0.0677
18        0.0546   0.0629    0.0512   0.0615    0.0560   0.0610
19        0.0492   0.0555    0.0572   0.0685    0.0657   0.0436
20        0.0554   0.0645    0.0413   0.0504    0.0596   0.0366
Average   0.0554   0.0664    0.0566   0.0732    0.0598   0.0823

M = 15000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0501   0.0566    0.0558   0.0839    0.0520   0.0833
2         0.0550   0.0633    0.0517   0.0620    0.0551   0.0648
3         0.0540   0.0618    0.0598   0.0778    0.0631   0.0646
4         0.0564   0.0696    0.0512   0.0629    0.0553   0.0847
5         0.0627   0.0714    0.0566   0.0721    0.0616   0.0896
6         0.0543   0.0629    0.0570   0.0772    0.0585   0.0874
7         0.0532   0.0688    0.0528   0.0727    0.0620   0.0566
8         0.0605   0.0711    0.0545   0.0768    0.0645   0.0855
9         0.0593   0.0706    0.0507   0.0576    0.0574   0.1214
10        0.0546   0.0664    0.0528   0.0634    0.0622   0.0631
11        0.0637   0.0701    0.0562   0.0756    0.0498   0.0662
12        0.0567   0.0640    0.0529   0.0677    0.0574   0.0983
13        0.0468   0.0569    0.0637   0.0880    0.0504   0.0913
14        0.0519   0.0663    0.0568   0.0716    0.0614   0.1162
15        0.0544   0.0620    0.0577   0.0764    0.0633   0.0802
16        0.0357   0.0472    0.0642   0.0851    0.0573   0.0549
17        0.0588   0.0710    0.0674   0.0867    0.0615   0.0496
18        0.0607   0.0774    0.0585   0.0732    0.0687   0.0824
19        0.0544   0.0655    0.0434   0.0633    0.0589   0.0759
20        0.0625   0.0808    0.0562   0.0687    0.0698   0.0978
Average   0.0553   0.0662    0.0560   0.0731    0.0595   0.0807

M = 20000

          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0594   0.0774    0.0544   0.0689    0.0691   0.0649
2         0.0504   0.0584    0.0664   0.0895    0.0661   0.0769
3         0.0503   0.0568    0.0554   0.0737    0.0647   0.0869
4         0.0566   0.0648    0.0617   0.0856    0.0518   0.0635
5         0.0576   0.0709    0.0547   0.0692    0.0610   0.0771
6         0.0584   0.0683    0.0528   0.0701    0.0516   0.0385
7         0.0481   0.0549    0.0485   0.0631    0.0460   0.0759
8         0.0545   0.0701    0.0628   0.0823    0.0592   0.0845
9         0.0535   0.0586    0.0561   0.0764    0.0574   0.1038
10        0.0514   0.0597    0.0582   0.0683    0.0689   0.0532
11        0.0531   0.0584    0.0569   0.0657    0.0572   0.1141
12        0.0551   0.0678    0.0536   0.0734    0.0618   0.0687
13        0.0555   0.0695    0.0636   0.0824    0.0616   0.1157
14        0.0577   0.0674    0.0541   0.0636    0.0572   0.0818
15        0.0597   0.0775    0.0536   0.0706    0.0595   0.0704
16        0.0593   0.0698    0.0616   0.0807    0.0551   0.0748
17        0.0535   0.0641    0.0487   0.0636    0.0696   0.0915
18        0.0599   0.0729    0.0576   0.0746    0.0507   0.1069
19        0.0581   0.0673    0.0472   0.0638    0.0623   0.1148
20        0.0518   0.0709    0.0601   0.0734    0.0638   0.0744
Average   0.0552   0.0663    0.0564   0.0729    0.0597   0.0819

Data Availability

The data used in this paper were downloaded from the Prosper website: https://www.prosper.com/invest/download.aspx

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Applied MathematicsJournal of

Hindawiwwwhindawicom Volume 2018

Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of

Hindawiwwwhindawicom Volume 2018

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawiwwwhindawicom Volume 2018

OptimizationJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

Hindawiwwwhindawicom Volume 2018

Operations ResearchAdvances in

Journal of

Hindawiwwwhindawicom Volume 2018

Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences

Hindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018

Hindawiwwwhindawicom Volume 2018

Decision SciencesAdvances in

Hindawiwwwhindawicom Volume 2018

AnalysisInternational Journal of

Hindawiwwwhindawicom Volume 2018

Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom

8 Mathematical Problems in Engineering

Table4Investm

entp

erform

anceso

finp

utparametersfor

portfolio

selection

Subset

119877lowast=5

119877lowast

=55

119877lowast

=6

119877lowast=5

119877lowast

=55

119877lowast

=6

119877lowast

=5

119877lowast =

55

119877lowast =

6

M=10000

M=10000

M=10000

M=15000

M=15000

M=15000

M=20000

M=20000

M=20000

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

IOM

RIOM

100598

007

27006

0100762

00502

007

2200501

005

6600558

008

3900520

008

3300594

007

7400544

006

8900691

006

492

00500

005

92006

01007

79006

75008

5100550

006

3300517

006

2000551

006

4800504

005

84006

64008

95006

6100769

3004

41005

7100491

006

9800735

009

3400540

006

1800598

007

7800631

006

4600503

005

6800554

007

37006

47008

694

00525

006

0200658

009

0300636

007

5400564

006

9600512

006

2900553

008

4700566

006

4800617

008

5600518

006

355

00532

006

2000631

008

2900513

007

3100627

007

1400566

007

2100616

008

9600576

007

0900547

006

9200610

007

716

00634

00747

00564

00762

00717

01105

00543

006

2900570

007

7200585

008

7400584

00 6

8 300528

007

0100516

003

857

00613

007

3600547

007

5400551

008

8400532

006

8800528

007

2700620

005

66004

81005

49004

85006

31004

60007

598

00529

005

9400505

006

16006

85008

58006

05007

1100545

00768

006

45008

5500545

007

0100628

008

2300592

008

459

00548

006

4700550

007

3600559

004

9300593

007

0600507

005

7600574

01214

00535

005

8600561

00764

00574

01038

10004

74005

74004

72006

3400499

007

9400546

006

6400528

006

3400622

006

3100514

005

9700582

006

83006

8900532

1100597

007

30006

02007

95006

6101090

00637

007

0100562

007

5600498

006

6200531

005

8400569

006

5700572

01141

12006

4400768

00541

006

7300624

01 042

00567

006

4000529

006

7700574

009

8300551

006

7800536

007

3400618

006

8713

00635

007

8500709

008

9500532

006

62004

68005

6900637

008

8000504

009

1300555

006

9500636

008

2400616

01157

1400593

00744

00626

007

5100634

01204

00519

006

6300568

007

1600614

01162

00577

006

7400541

006

3600572

008

1815

00523

006

36004

85006

0900571

009

8700544

006

2000577

00764

00633

008

0200597

007

7500536

007

0600595

007

0416

00549

007

05006

84008

9300508

01264

00357

004

72006

42008

5100573

005

4900593

006

9800616

008

0700551

00748

1700549

006

6600549

007

5700538

006

7700588

007

10006

74008

6700615

004

9600535

006

41004

87006

3600 69 6

009

1518

00546

006

2900512

006

1500560

006

10006

07007

7400585

007

32006

87008

2400599

007

2900576

00746

00507

01069

1900492

005

5500572

006

8500657

004

3600544

006

5500434

006

3300589

007

5900581

006

73004

72006

3800623

01148

2000554

006

45004

13005

0400596

003

6600625

008

0800562

006

8700698

009

7800518

007

09006

01007

3400638

00744

Average

00554

006

6400566

007

3200598

008

2300553

006

6200560

007

3100595

008

0700552

006

6300564

007

2900597

008

19

Mathematical Problems in Engineering 9

Data Availability

The data this paper used is downloaded from the website ofProsper httpswwwprospercominvestdownloadaspx

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paperrdquo

Acknowledgments

The research is supported by the National Natural ScienceFoundation of China (Grants nos 71471027 71731003 and71873103) the National Social Science Foundation of China(Grant no 16BTJ017) National Natural Science Foundationof China Youth Project (Grant no 71601041) LiaoningEconomic and Social Development Key Issues (Grant no2015lslktzdian-05) and Liaoning Provincial Social SciencePlanning Fund Project (Grant no L16BJY016) The authorsacknowledge the organizations mentioned above

References

[1] H Markowitz ldquoPortfolio selectionrdquoe Journal of Finance vol7 no 1 pp 77ndash91 1952

[2] H M Markowitz Portfolio Selection Efficient Diversication ofInvestment Wiley New York NY USA 1959

[3] N Larsen H Mausser and S Uryasev ldquoAlgorithms for opti-mization ofValue-atRiskrdquo in Financial Engineering ECommerceand Supply Chain Applied Optimization P M Pardalos andV K Tsitsiringos Eds vol 70 Kluwer Academic PublishersDordrecht 2002

[4] R T Rockafellar and S Uryasev ldquoConditional value-at-risk forgeneral loss distributionsrdquo Journal of Bankingamp Finance vol 26no 7 pp 1443ndash1471 2002

[5] Y H Guo W J Zhou C Y Luo C R Liu and H XiongldquoInstance-based credit risk assessment for investment decisionsin P2P Lendingrdquo European Journal of Operational Research vol249 no 2 pp 417ndash426 2016

[6] S C P Yam H Yang and F L Yuen ldquoOptimal asset allocationRisk and information uncertaintyrdquo European Journal of Opera-tional Research vol 251 no 2 pp 554ndash561 2016

[7] R Emekter Y Tu B Jirasakuldech and M Lu ldquoEvaluatingcredit risk and loan performance in online Peer-to-Peer (P2P)lendingrdquo Applied Economics vol 47 no 1 pp 54ndash70 2014

[8] E Berkovich ldquoSearch and herding effects in peer-to-peerlending evidence from prospercomrdquo Annals of Finance vol 7no 3 pp 389ndash405 2011

[9] E I Altman ldquoFinancial ratios discriminant analysis and theprediction of corporate bankruptcyrdquoe Journal of Finance vol23 no 4 pp 589ndash609 1968

[10] S Chatterjee and S Barcun ldquoA nonparametric approach tocredit screeningrdquo Publications of the American Statistical Asso-ciation vol 65 no 329 pp 150ndash154 1970

[11] J C Wigintor ldquoA note on the comparison of logit and discrim-inant models of consumer credit behaviorrdquo Journal of Financialand Quantitative Analysis vol 15 no 3 pp 757ndash770 1980

[12] L Breiman J H Friedman R Olshen and C Stone Classifi-cation and Regression Trees Wadsworth Belmont Calif USA1983

[13] M M So and L C Thomas ldquoModelling the profitability ofcredit cards by Markov decision processesrdquo European Journalof Operational Research vol 212 no 1 pp 123ndash130 2011

[14] G Andreeva J Ansell and J Crook ldquoModelling profitabilityusing survival combination scoresrdquo European Journal of Opera-tional Research vol 183 no 3 pp 1537ndash1549 2007

[15] D West ldquoNeural network credit scoring modelsrdquo Computers ampOperations Research vol 27 pp 1131ndash1152 2000

[16] J J Huang G H Tzeng and C S Ong ldquoTwo-stage geneticprogramming (2SGP) for the credit scoring modelrdquo AppliedMathematics and Computation vol 174 no 2 pp 1039ndash10532006

[17] C L Huang M C Chen and C J Wang ldquoCredit scoring witha data mining approach based on support vector machinesrdquoExpert Systems with Applications vol 33 no 4 pp 847ndash8562007

[18] P Danenas and G Garsva ldquoSelection of support vectormachines based classifiers for credit risk domainrdquo ExpertSystems with Applications vol 42 no 6 pp 3194ndash3204 2015

Mathematical Problems in Engineering 9

Data Availability

The data used in this paper were downloaded from the Prosper website: https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.

References

[1] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.

[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.

[3] N. Larsen, H. Mausser, and S. Uryasev, “Algorithms for optimization of Value-at-Risk,” in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.

[4] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for general loss distributions,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, “Instance-based credit risk assessment for investment decisions in P2P lending,” European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.

[6] S. C. P. Yam, H. Yang, and F. L. Yuen, “Optimal asset allocation: Risk and information uncertainty,” European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.

[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, “Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending,” Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.

[8] E. Berkovich, “Search and herding effects in peer-to-peer lending: evidence from prosper.com,” Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.

[9] E. I. Altman, “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy,” The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.

[10] S. Chatterjee and S. Barcun, “A nonparametric approach to credit screening,” Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.

[11] J. C. Wigintor, “A note on the comparison of logit and discriminant models of consumer credit behavior,” Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.

[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.

[13] M. M. So and L. C. Thomas, “Modelling the profitability of credit cards by Markov decision processes,” European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.

[14] G. Andreeva, J. Ansell, and J. Crook, “Modelling profitability using survival combination scores,” European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.

[15] D. West, “Neural network credit scoring models,” Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.

[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, “Two-stage genetic programming (2SGP) for the credit scoring model,” Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.

[17] C. L. Huang, M. C. Chen, and C. J. Wang, “Credit scoring with a data mining approach based on support vector machines,” Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.

[18] P. Danenas and G. Garsva, “Selection of support vector machines based classifiers for credit risk domain,” Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.

[19] G. Sermpinis, S. Tsoukas, and P. Zhang, “Modelling market implied ratings using LASSO variable selection techniques,” Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.

[20] K. Natarajan, D. Pachamanova, and M. Sim, “Constructing risk measures from uncertainty sets,” Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.

[21] L. Chen, S. He, and S. Zhang, “Tight bounds for some risk measures, with applications to robust portfolio selection,” Operations Research, vol. 59, no. 4, pp. 847–865, 2011.

[22] L. G. Epstein, “A paradox for the ‘smooth ambiguity’ model of preference,” Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.

[23] K. Natarajan, M. Sim, and J. Uichanco, “Tractable robust expected utility and risk models for portfolio optimization,” Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.

[24] A. B. Pac and M. C. Pınar, “Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity,” TOP, vol. 22, no. 3, pp. 875–891, 2014.

[25] L. P. Hansen and T. J. Sargent, “Robust control and model uncertainty,” The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.

[26] G. C. Calafiore, “Ambiguous risk measures and optimal robust portfolios,” Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.

[27] D. Bertsimas, V. Gupta, and N. Kallus, “Data-driven robust optimization,” Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.

[28] Z. Kang, X. Li, Z. Li, and S. Zhu, “Data-driven robust mean-CVaR portfolio selection under distribution ambiguity,” Quantitative Finance, pp. 1–17, 2018.

[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.

[30] O. Scaillet, “Nonparametric estimation and sensitivity analysis of expected shortfall,” Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.

[31] H. Yao, Z. Li, and Y. Lai, “Mean–CVaR portfolio selection: A nonparametric estimation framework,” Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.

[32] E. A. Nadaraja, “On non-parametric estimates of density functions and regression,” Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.

[33] T. Chen and T. He, “Higgs boson discovery with boosted trees,” in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.

[34] Y. Xia, C. Liu, Y. Li, and N. Liu, “A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring,” Expert Systems with Applications, vol. 78, pp. 225–241, 2017.

[35] H. He, W. Zhang, and S. Zhang, “A novel ensemble method for credit scoring: Adaption of different imbalance ratios,” Expert Systems with Applications, vol. 98, pp. 105–117, 2018.

[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, “A hybrid KMV model, random forests and rough set theory approach for credit rating,” Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.

[37] S. Oreski, D. Oreski, and G. Oreski, “Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment,” Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.

[38] V. Kozeny, “Genetic algorithms for credit scoring: Alternative fitness function performance comparison,” Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.
