Research Article
Data-Driven Robust Credit Portfolio Optimization for Investment Decisions in P2P Lending
Guotai Chi, Shijie Ding, and Xiankun Peng
Faculty of Management and Economics, Dalian University of Technology, Dalian 116024, China
Correspondence should be addressed to Shijie Ding; ding0601@126.com
Received 24 October 2018; Accepted 24 December 2018; Published 2 January 2019
Academic Editor: Emilio Gomez-Deniz
Copyright © 2019 Guotai Chi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Peer-to-Peer (P2P) lending has attracted increasing attention recently. As an emerging micro-finance platform, P2P lending plays a role in removing intermediaries, reducing transaction costs, and increasing the benefits of both borrowers and lenders. However, P2P lending investment faces two major challenges: the lack of historical observations of loans for a given borrower and the ambiguity of the estimated loan distribution. To address these difficulties, this paper proposes a data-driven robust model of portfolio optimization with relative entropy constraints, based on an "instance-based" credit risk assessment framework. The model exploits a nonparametric kernel approach to estimate the expected return and risk of P2P loans when historical data on the same borrower are unavailable. Furthermore, we construct a robust mean-variance optimization problem based on the relative entropy method for P2P loan investment decisions. Using a real-world dataset from a notable P2P lending platform, Prosper, we validate the proposed model. Empirical results reveal that our model provides better investment performance than the existing model.
1. Introduction
Peer-to-peer lending, as an emerging form of online micro-finance, provides services that bring borrowers and lenders together virtually and help them lend to and borrow from each other directly. P2P lending platforms remove traditional financial intermediaries, reduce transaction costs, and increase the benefits of both borrowers and lenders; they therefore improve the efficiency of the financial market. However, traditional financial intermediaries can use collateral, certified accounts, and other means to enhance the creditworthiness of borrowers. In their absence, severe information asymmetry exists between borrowers and lenders, and the credit risk of P2P loan investment is very high.
Credit risk in P2P lending refers to the potential monetary loss arising from a borrower's default on a loan. Efficient and reasonable investment in P2P loans needs to be based on a reliable assessment of the credit risk distribution. Estimating the credit risk distribution of P2P loans is very challenging because it is difficult to obtain historical return (or loss) data for a loan awaiting investment. In other words, historical yield data for the same borrower are usually unavailable. Moreover, even when the distribution of loans' returns (or losses) is approximated from the limited available data or from expert knowledge, the approximation is usually not accurate; this is known as the distribution ambiguity (probability measure uncertainty) problem. In this paper, we formulate a data-driven robust portfolio optimization model based on an "instance-based" credit risk assessment method for investment decisions in P2P lending.
To help personal lenders mitigate risk, current online P2P lending platforms have taken some risk-reducing measures, such as filtering out high-risk borrowers whose FICO score is lower than a threshold, making a preliminary rating of each loan, and providing investors with the risk level of each loan. Thus each loan is marked with a grade such as AA, A, B, C, D, E, or NR, and loans with the same grade are considered to have the same risk level. These rating-based models are more suitable for traditional banks and lending institutions, since they have the capability to grant large amounts of loans to diversify their investments. However, individual investors possess only small amounts of funds; they need more refined risk assessment methods and investment strategies.

Hindawi, Mathematical Problems in Engineering, Volume 2019, Article ID 1902970, 10 pages, https://doi.org/10.1155/2019/1902970
Similar to bond investment, P2P investors can fund a portion, rather than the whole, of each loan. Therefore, investors can decide which loans to invest in and, at the same time, determine the amount to invest in each loan. This mechanism allows investors to construct a credit portfolio to mitigate risk.
Markowitz [1] proposed the famous mean-variance model, which is still widely used in portfolio selection and risk management. Since then, researchers have proposed a variety of mean-risk models, such as the mean-downside risk model [2], the mean-VaR model [3], the mean-CVaR model [4], and so on. In practice, the distribution of the assets needs to be estimated first, and then the optimal portfolio can be identified by the optimization model.
For P2P lending investment, as mentioned above, such procedures face at least two major challenges, i.e., the lack of historical observations of loans and the ambiguity of the estimated loan distribution (the probability measure uncertainty problem). Thus, this paper proposes a data-driven robust model of portfolio optimization based on relative entropy constraints, combined with an instance-based credit risk assessment method.
Specifically, we use the "instance-based" credit risk assessment method proposed by Guo et al. [5] to evaluate the return and risk of each loan without sufficient historical loan data for each individual borrower. In this instance-based framework, the expected return of each loan is predicted as a weighted average of the returns of historical loans of other, similar borrowers, where the optimal weights are learnt by kernel regression. Furthermore, using the moment information (mean and variance) of the new loans, we formulate the robust portfolio optimization model with relative entropy constraints, which obtains an optimal portfolio under the worst-case scenario and can reduce the potential loss caused by uncertainty in the loans' distribution.
Our work is related to the papers by Guo et al. [5] and Yam et al. [6]. Guo et al. [5] introduce the instance-based framework into credit risk assessment of P2P loans and use the classical mean-variance model to obtain the optimal allocation. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection. Although other scholars have contributed novel insights into credit risk assessment of P2P lending and into robust optimization, to the best of our knowledge, few have considered both together. The main contribution of this paper is that we propose a data-driven robust portfolio optimization model based on relative entropy constraints, combined with an instance-based risk assessment framework for P2P loan investment, and obtain superior performance in numerical experiments.
The rest of this paper is organized as follows. Section 2 provides the literature review. Section 3 introduces the instance-based model for credit risk assessment as well as the mathematical framework of the kernel regression approach. In Section 4, we elaborate the robust optimization model based on the relative entropy method and formulate a robust mean-variance optimization model for P2P lending investment. The empirical results on the effectiveness of our model are reported in Section 5. Finally, Section 6 concludes this work.
2. Literature Review
Researchers have carried out many studies to assess risk and assist investment decision making in P2P lending. Emekter et al. [7] explore the dominant factors that explain funding success and credit risk and, at the same time, measure the performance of P2P loans. They find that credit grade, debt-to-income ratio, FICO score, and revolving line utilization play an important role in loan defaults; furthermore, loans with lower credit grade and longer duration may have a high mortality rate, and the higher interest rates charged to low-credit-grade borrowers are not sufficient to compensate for the higher likelihood of default. Thus, the authors suggest that investors should invest more in high-grade loans. Similarly, Berkovich's [8] study finds that high-quality loans offer excess return.
The above studies investigate the factors determining credit risk and analyze the performance of P2P loans; however, they do not propose a mechanism that assists individual investors in allocating funds across loans effectively and making optimal investment decisions.
To help personal lenders mitigate risk, popular online P2P platforms such as Lending Club and Prosper have developed credit scoring systems to assess the creditworthiness of each borrower based on data mining or machine learning techniques. A large body of literature is concerned with credit rating using data mining techniques, for example, linear discriminant analysis (LDA) [9], k-nearest neighbors [10], logistic regression [11], classification and regression trees (CART) [12], Markov chains [13], survival analysis [14], artificial neural networks (ANN) [15], genetic methods [16], support vector machines (SVM) [17, 18], lasso-probit [19], and so on.
In the portfolio selection problem, full knowledge of the assets' distribution is usually assumed in order to determine the optimal portfolio. In most real-life applications, the assets' distribution has to be approximated. However, the approximations are not necessarily accurate; this is known as the distribution ambiguity (probability measure uncertainty) problem.
Robust optimization is an attractive way to solve the portfolio selection problem under distribution ambiguity. As the exact parameters are unavailable, Natarajan et al. [20] use a set of parameters (which represent different distributions or scenarios), rather than a point estimate of the parameters, to formulate the asset allocation problem. Following this idea, there are different ways to model ambiguity by using a set of parameters. Chen et al. [21] take the lower partial moments and CVaR as two risk measures and consider tight bounds that are likely to cover the possible parameters. Epstein [22] considers intervals that may include the actual parameters. Natarajan et al. [23] use a piecewise-linear concave utility function to derive accurate and estimated optimal strategies for the expected utility model in portfolio optimization under worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity and identify the optimal portfolio.
Since relative entropy measures the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.
In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors possess only historical data on asset returns. Bertsimas et al. [27] use the KS test, the $\chi^2$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the available data points. In contrast, the structure of the uncertainty set in our study is not predefined, and we consider the uncertainty of the mean, covariance, and distribution jointly. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with data on stock indices, while our work is based on the mean-variance framework with data on P2P loans.
3. Instance-Based Model for Credit Risk Assessment
Using historical data to evaluate future performance and potential loss is conventional. However, unlike investment in bonds or stocks, historical yield data for the same P2P borrower are usually unavailable. Thus the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].
3.1. Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan $i$, using $n$ past loans, each with a historical return $R_j$ ($j = 1, 2, \ldots, n$), we can calculate the expected return of loan $i$, $\mu_i$, as a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j \tag{1}$$
where $w_{ij}$ denotes the weight of loan $j$ for predicting the expected return of loan $i$. The weight depends on the similarity between loan $i$ and loan $j$: intuitively, the greater the similarity, the greater the weight. The calculation of the weight is introduced in Section 3.2.
The weighted returns of the past loans are treated as historical observations of a new loan. Following this line of thought and taking variance as the risk measure, the weighted variance of past loans is used to assess the new loan's risk, that is,

$$\sigma_i^2 = \sum_{j=1}^{n} w_{ij} (R_j - \mu_i)^2 \tag{2}$$
where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and therefore the larger the weight. In particular, the absolute deviation of default probabilities between loans $i$ and $j$ is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans $i$ and $j$, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process is introduced in the next subsection.
3.2. Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method for investigating the nonlinear relation between random variables, and it is based on kernel density estimation. First, the preliminaries of kernel estimation are introduced.
Given $n$ realizations $z_j$, $j = 1, \ldots, n$, of a random variable $z$, the kernel estimate $\hat{p}(z)$ of the probability density function $p(z)$ is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{z_j - z}{h}\right) \tag{3}$$

where $K(\cdot)$ is a kernel function and $h$ is a smoothing parameter.
The kernel function $K(\cdot)$ is nonnegative and bounded and satisfies the following properties:

(a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$;
(b) $\int_{-\infty}^{\infty} z K(z)\,dz = 0$;
(c) $\int_{-\infty}^{\infty} z^2 K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as the uniform, triangular, biweight, triweight, and Gaussian kernels [29]. Because the kernel estimate is insensitive to the choice of kernel function, we use the Gaussian kernel function for its convenient mathematical properties, which is written as $K(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$.
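As a minimal sketch of the kernel estimate in (3) with the Gaussian kernel, the following Python snippet evaluates $\hat{p}(z)$ at a single point. The function names `gaussian_kernel` and `kernel_density`, the sample values, and the bandwidth are illustrative assumptions and are not taken from the paper.

```python
import math

def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def kernel_density(z, samples, h):
    """Kernel estimate of p(z) as in (3): (1 / (n h)) * sum_j K((z_j - z) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((zj - z) / h) for zj in samples) / (n * h)

# Hypothetical realizations z_j and bandwidth h, chosen only for illustration.
samples = [0.02, 0.05, 0.07, 0.11, 0.15]
density_at_point = kernel_density(0.06, samples, h=0.05)
```

Because the Gaussian kernel integrates to one, the estimate itself integrates to one for any positive bandwidth, which is property (a) carried over to $\hat{p}$.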
The smoothing parameter $h = h(n)$ is also called the bandwidth and depends on the sample size $n$. Specifically, $h(n) \to 0$ and $n \cdot h(n) \to \infty$ as $n \to \infty$.
Many studies reveal that the choice of kernel function does not affect the estimation significantly; however, the choice of the bandwidth is a vital issue [30, 31]. The determination of the bandwidth is shown in detail in Section 5.3.
In the following, we introduce the kernel regression model proposed by Nadaraya [32]. Theoretically, we assume that each observation is denoted as $(X, Y)$, which is an $\mathbb{R}^2$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K((x - x_j)/h)}{\sum_{k=1}^{n} K((x - x_k)/h)} \cdot y_j \right] \tag{4}$$
where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the $j$th loan, respectively. Thereby, the estimate of the $i$th loan's return can be written as

$$\mu_i = \sum_{j=1}^{n} \left[ \frac{K((p_i - p_j)/h)}{\sum_{k=1}^{n} K((p_i - p_k)/h)} \cdot R_j \right] \tag{5}$$
Note that the determination of the loans' default probability is introduced in Section 5.1.

Comparing (1) to (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K((p_i - p_j)/h)}{\sum_{k=1}^{n} K((p_i - p_k)/h)} \tag{6}$$
Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

$$\sigma_i^2 = \sum_{j=1}^{n} \left[ \frac{K((p_i - p_j)/h)}{\sum_{k=1}^{n} K((p_i - p_k)/h)} \cdot (R_j - \mu_i)^2 \right] \tag{7}$$
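The estimates in (5) to (7) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `kernel_weights` and `assess_loan`, the historical default probabilities and returns, and the bandwidth value are hypothetical and not taken from the paper.

```python
import math

def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def kernel_weights(p_new, p_hist, h):
    """Weights w_ij from (6): kernel values of (p_i - p_j) / h, normalized to sum to 1."""
    raw = [gaussian_kernel((p_new - pj) / h) for pj in p_hist]
    total = sum(raw)
    return [value / total for value in raw]

def assess_loan(p_new, p_hist, r_hist, h):
    """Return (mu_i, sigma_i^2) for a new loan, following (5) and (7)."""
    w = kernel_weights(p_new, p_hist, h)
    mu = sum(wj * rj for wj, rj in zip(w, r_hist))
    var = sum(wj * (rj - mu) ** 2 for wj, rj in zip(w, r_hist))
    return mu, var

# Hypothetical closed loans: default probabilities p_j and realized returns R_j.
p_hist = [0.05, 0.10, 0.20, 0.40]
r_hist = [0.08, 0.07, 0.02, -0.30]
weights = kernel_weights(0.12, p_hist, h=0.06)
mu_i, var_i = assess_loan(0.12, p_hist, r_hist, h=0.06)
```

Since the weights are nonnegative and sum to one, the estimated return always lies between the smallest and largest historical return, and the historical loan whose default probability is closest to that of the new loan receives the largest weight.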
4. Robust Investment Decision Model
Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans. We start from the classical mean-variance optimization model proposed by Markowitz [1] and then move to its tractable robust counterpart.
4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to investors' risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. Then the return and risk of the portfolio become $\lambda^{\mathsf{T}}\mu$ and $\lambda^{\mathsf{T}} V \lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$, respectively. Here $P$ represents the ideal estimated market condition, where $\mu$ and $V$, estimated by using all available information including historical observations, news, expert knowledge, and so on, are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$\text{(MV)} \quad \begin{aligned} \min_{\lambda} \quad & \lambda^{\mathsf{T}} V \lambda \\ \text{s.t.} \quad & \lambda^{\mathsf{T}} \mu \ge R^{*}, \\ & \lambda \in \Omega \end{aligned} \tag{8}$$
where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^{*}$ is the required return rate specified by the investor.
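To make the two quantities in (8) concrete, the sketch below evaluates the portfolio return $\lambda^{\mathsf{T}}\mu$ and the portfolio risk $\lambda^{\mathsf{T}} V \lambda$ for one candidate allocation and checks the return constraint. It does not solve the optimization problem. All numbers and the function names `portfolio_return` and `portfolio_variance` are hypothetical.

```python
def portfolio_return(lam, mu):
    """Expected portfolio return lambda^T mu."""
    return sum(l * m for l, m in zip(lam, mu))

def portfolio_variance(lam, V):
    """Portfolio risk lambda^T V lambda for a covariance matrix given as nested lists."""
    n = len(lam)
    return sum(lam[i] * V[i][j] * lam[j] for i in range(n) for j in range(n))

# Hypothetical inputs for three assets.
mu = [0.06, 0.08, 0.05]
V = [[0.010, 0.002, 0.000],
     [0.002, 0.030, 0.001],
     [0.000, 0.001, 0.005]]
lam = [0.5, 0.2, 0.3]   # candidate weights, summing to 1
R_star = 0.05           # required return rate

ret = portfolio_return(lam, mu)
risk = portfolio_variance(lam, V)
meets_return_target = ret >= R_star
```

A solver for (8) searches over all feasible $\lambda$ for the allocation with the smallest `risk` among those for which `meets_return_target` holds.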
In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus the optimal portfolio identified by using the estimated input parameters $\mu$ and $V$ directly may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that remain feasible no matter what the actual values of the parameters are.
Investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$\text{(RMV)} \quad \begin{aligned} \min_{\lambda} \quad & \sup_{Q \in \mathcal{Q}} \lambda^{\mathsf{T}} V_Q \lambda \\ \text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^{\mathsf{T}} \mu_Q \ge R^{*}, \\ & \lambda \in \Omega \end{aligned} \tag{9}$$
It is rational to assume that the actual value of the parameters is in the neighborhood of the estimator. Thus we can generate the uncertainty set $\mathcal{Q}$ based on the assumption that the measures in the set should not be far from the ideal measure $P$. Relative entropy, also known as the Kullback-Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of the measure $Q$ in $\mathcal{Q}$ with respect to the measure $P$ is

$$D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)}\,dx \tag{10}$$
where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

$$D_{KL}(Q \,\|\, P) = \frac{1}{2}\left[\ln|V| - \ln|V_Q| + \operatorname{tr}(V^{-1} V_Q) - n + (\mu - \mu_Q)^{\mathsf{T}} V^{-1} (\mu - \mu_Q)\right] \tag{11}$$

where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^{\mathsf{T}}$ are the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
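Expression (11) has the same form as the Kullback-Leibler divergence between two multivariate normal distributions. As a hedged illustration, the sketch below evaluates it in the special case where both $V$ and $V_Q$ are diagonal, so that determinants, traces, and inverses reduce to elementwise operations. The diagonal restriction and the function name `kl_gaussian_diag` are assumptions made here to keep the example free of matrix libraries.

```python
import math

def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """D_KL(Q || P) from (11) when V and V_Q are diagonal.

    var_p and var_q hold the diagonal entries of V and V_Q, respectively.
    """
    n = len(mu_p)
    log_det_p = sum(math.log(v) for v in var_p)                # ln |V|
    log_det_q = sum(math.log(v) for v in var_q)                # ln |V_Q|
    trace_term = sum(vq / vp for vq, vp in zip(var_q, var_p))  # tr(V^{-1} V_Q)
    mean_term = sum((mp - mq) ** 2 / vp
                    for mp, mq, vp in zip(mu_p, mu_q, var_p))  # (mu - mu_Q)^T V^{-1} (mu - mu_Q)
    return 0.5 * (log_det_p - log_det_q + trace_term - n + mean_term)

# Hypothetical two-asset example: Q shifts the means and inflates the variances of P.
mu_p, var_p = [0.06, 0.08], [0.010, 0.030]
mu_q, var_q = [0.05, 0.07], [0.012, 0.033]
divergence = kl_gaussian_diag(mu_q, var_q, mu_p, var_p)
```

The divergence is zero when $Q$ and $P$ share the same mean and covariance and grows as the candidate parameters move away from the estimates, which is what lets a bound on it define a neighborhood of $P$.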
Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measures $Q$ in $\mathcal{Q}$. Using the relative entropy constraint, we can rewrite the robust optimization model (9) as

$$\text{(RMV-RE)} \quad \begin{aligned} \min_{\lambda} \quad & \max_{(\mu_Q, V_Q) \in U} \lambda^{\mathsf{T}} V_Q \lambda \\ \text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^{\mathsf{T}} \mu_Q \ge R^{*}, \\ & D_{KL}(Q \,\|\, P) \le K, \\ & \lambda \in \Omega \end{aligned} \tag{12}$$
where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investors' confidence in $\mu$ and $V$ estimated under the probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.
Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is a tractable formulation and can be solved efficiently. That is,

$$\begin{aligned} \min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{\mathsf{T}} V^{*} \lambda \\ \text{s.t.} \quad & \lambda^{\mathsf{T}} \mu^{*} \ge R^{*}, \\ & \lambda \in \Omega \end{aligned} \tag{13}$$
Herein $\mu^{*} = \zeta\mu$, $V^{*} = V + \zeta(1 - \zeta)\mu\mu^{\mathsf{T}}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and as $\zeta$ decreases, the investor's confidence weakens. For the details of the proof, we refer the reader to Yam et al. [6].
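The adjusted parameters entering (13) are simple to compute once $\mu$, $V$, and $\zeta$ are given. The following sketch does so with plain Python lists; the function name `robust_parameters` and the numerical inputs are hypothetical.

```python
def robust_parameters(mu, V, zeta):
    """Return (mu_star, V_star) with mu* = zeta * mu and V* = V + zeta * (1 - zeta) * mu mu^T."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    mu_star = [zeta * m for m in mu]
    scale = zeta * (1.0 - zeta)
    V_star = [[V[i][j] + scale * mu[i] * mu[j] for j in range(n)] for i in range(n)]
    return mu_star, V_star

# Hypothetical two-asset inputs with a confidence level of 0.75.
mu = [0.06, 0.08]
V = [[0.01, 0.00],
     [0.00, 0.03]]
mu_star, V_star = robust_parameters(mu, V, zeta=0.75)
```

With $\zeta = 1$ the function returns the original $\mu$ and $V$. For $\zeta < 1$ the expected returns shrink toward zero and the rank-one term $\zeta(1-\zeta)\mu\mu^{\mathsf{T}}$ adds risk, including off-diagonal entries even when $V$ itself is diagonal.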
4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\sigma_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^{\mathsf{T}}$ and

$$V = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_n^2 \end{bmatrix} \tag{14}$$
denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

$$\begin{aligned} \min_{\lambda \in \mathbb{R}^n} \quad & \lambda^{\mathsf{T}} \left(V + \zeta(1 - \zeta)\mu\mu^{\mathsf{T}}\right) \lambda \\ \text{s.t.} \quad & \lambda^{\mathsf{T}} (\zeta\mu) \ge R^{*}, \\ & \lambda \in \Omega \end{aligned} \tag{15}$$

Table 1: Description of variables.

Variable  Description
X1        FICO score of the borrower
X2        The number of inquiries of the borrower in the last 6 months
X3        The monetary amount of the loan
X4        The homeownership status of the borrower (0 = rent, 1 = own)
X5        The debt-to-income ratio of the borrower
X6        The number of accounts delinquent
X7        The number of public records in the past 10 years
Y         Dependent variable (0 = completed, 1 = default)
The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.
(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.
(3) For each loan, the amount that a lender can invest is no more than the borrower's request $m_i$; thereby $\lambda_i M \le m_i$, where $M$ is the total investment amount the investor has available.
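The three constraints above, together with the objective of (15), can be checked for a given allocation as sketched below. The sketch uses the fact that, for a diagonal covariance matrix, $\lambda^{\mathsf{T}}(V + \zeta(1-\zeta)\mu\mu^{\mathsf{T}})\lambda = \sum_i \sigma_i^2\lambda_i^2 + \zeta(1-\zeta)(\lambda^{\mathsf{T}}\mu)^2$. It evaluates candidates only and is not a solver; the function names, tolerance, and loan data are hypothetical.

```python
def is_feasible(lam, M, requests, tol=1e-9):
    """Check constraints (1)-(3): budget, no short-selling, and per-loan request caps."""
    budget_ok = abs(sum(lam) - 1.0) <= tol
    no_short_selling = all(l >= -tol for l in lam)
    within_requests = all(l * M <= m + tol for l, m in zip(lam, requests))
    return budget_ok and no_short_selling and within_requests

def robust_objective(lam, mu, variances, zeta):
    """Objective of (15) for diagonal V: sum_i sigma_i^2 lam_i^2 + zeta (1 - zeta) (lam^T mu)^2."""
    own_risk = sum(v * l * l for v, l in zip(variances, lam))
    port_ret = sum(l * m for l, m in zip(lam, mu))
    return own_risk + zeta * (1.0 - zeta) * port_ret ** 2

# Hypothetical loans: expected returns, variances, and amounts requested by borrowers.
mu = [0.06, 0.08, 0.05]
variances = [0.010, 0.030, 0.005]
requests = [5000.0, 8000.0, 4000.0]
M = 15000.0

lam_ok = [0.3, 0.5, 0.2]    # invests 4500, 7500, 3000
lam_bad = [0.4, 0.4, 0.2]   # invests 6000 in the first loan, above its request of 5000
objective_value = robust_objective(lam_ok, mu, variances, zeta=0.75)
```

Constraint (3) is what distinguishes this setting from a standard portfolio problem: the cap on each weight depends on the investor's budget $M$, so a larger budget tightens the feasible region.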
5. Empirical Analysis
In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using a real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.
5.1. Data Description and Preprocessing. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.
Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1-7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.
Figure 1: The curve of CV(h). [Plot of CV(h) (vertical axis) against the bandwidth h (horizontal axis).]
Many credit scoring models exist to predict the default probability of a loan, such as the XGBoost model [33-35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
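For readers unfamiliar with the scoring step, a fitted logistic regression maps the loan attributes to a default probability through $p = 1/(1 + e^{-(b_0 + \sum_k b_k x_k)})$. The sketch below applies this formula only. The coefficients and intercept are made-up placeholders, not estimates from the Prosper data, and the feature order X1 to X7 follows Table 1 as an assumption.

```python
import math

def default_probability(features, coefficients, intercept):
    """Logistic model p = 1 / (1 + exp(-(b0 + sum_k b_k x_k))), evaluated stably."""
    score = intercept + sum(b * x for b, x in zip(coefficients, features))
    if score >= 0.0:
        return 1.0 / (1.0 + math.exp(-score))
    expo = math.exp(score)  # avoids overflow for very negative scores
    return expo / (1.0 + expo)

# Hypothetical coefficients for X1..X7 (FICO, inquiries, amount, homeownership,
# debt-to-income, delinquent accounts, public records); NOT fitted values.
coefficients = [-0.01, 0.15, 0.00002, -0.2, 1.5, 0.3, 0.2]
intercept = 4.0

loan = [680, 2, 10000, 1, 0.25, 0, 0]
p_default = default_probability(loan, coefficients, intercept)
```

In the paper's pipeline, the resulting probabilities $p_i$ are the only inputs to the kernel weights in (6), so any scoring model that outputs a default probability could take the place of this step.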
We randomly divide the dataset into two parts: the first, containing 40% of all loans, is used to determine the optimal bandwidth $h$ in (5), as described in detail in Section 5.3; the second contains the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (their pay-back statuses are thus treated as unknown), and all other subsets are taken as the training set, which consists of historical loans with known yield.
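The splitting scheme described above can be sketched as follows. The function names, the fixed seed, and the use of index lists are illustrative assumptions; the paper does not state how the random split was implemented.

```python
import random

def split_dataset(n_loans, bandwidth_share=0.4, n_folds=20, seed=0):
    """Shuffle loan indices, reserve a share for bandwidth selection, and fold the rest."""
    indices = list(range(n_loans))
    random.Random(seed).shuffle(indices)
    n_bandwidth = int(bandwidth_share * n_loans)
    bandwidth_part = indices[:n_bandwidth]
    rest = indices[n_bandwidth:]
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds

def train_test_rounds(folds):
    """Yield (train, test) index lists, using each fold once as the testing set."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

bandwidth_part, folds = split_dataset(17001)
fold_sizes = [len(fold) for fold in folds]
```

With 17001 loans, the second part holds roughly 10200 loans, so each of the 20 folds has 510 or 511 loans, consistent with the "approximately 510" stated in the text.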
5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. To show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.
IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. Then the classical mean-variance model (8) is used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.
RIOM is the robust instance-based model in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.
We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i^2$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.
(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment in the optimal portfolio.
(3) Compare the return rates of the two models.
5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross validation:

$$h_{optimal} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_h(p_{-i}) - \mu_i\right)^2 \tag{16}$$
where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1, j \ne i}^{n} \left[ \frac{K((p_i - p_j)/h)}{\sum_{k=1, k \ne i}^{n} K((p_i - p_k)/h)} \cdot R_j \right] \tag{17}$$
The curve of CV(h) is shown in Figure 1. The shape of the curve clearly shows a minimum, and the $h$ corresponding to the minimum is the optimal bandwidth for the model.
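A grid search over (16) and (17) might look like the sketch below. Two assumptions are made and should be read as such: the realized return $R_i$ of the held-out loan is used as the target in place of $\mu_i$, and the candidate bandwidths form an evenly spaced grid from 0.02 to 0.20. The toy data and function names are hypothetical.

```python
import math

def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def loo_estimate(i, p, r, h):
    """Leave-one-out estimate (17) for loan i; returns None if all kernel weights vanish."""
    num = 0.0
    den = 0.0
    for j in range(len(p)):
        if j == i:
            continue
        k = gaussian_kernel((p[i] - p[j]) / h)
        num += k * r[j]
        den += k
    return num / den if den > 0.0 else None

def cv_score(p, r, h):
    """CV(h) as in (16), with the realized return R_i as the target (an assumption)."""
    total = 0.0
    for i in range(len(p)):
        estimate = loo_estimate(i, p, r, h)
        if estimate is None:
            return float("inf")
        total += (estimate - r[i]) ** 2
    return total

def select_bandwidth(p, r, grid):
    """Pick the grid value that minimizes CV(h)."""
    return min(grid, key=lambda h: cv_score(p, r, h))

# Hypothetical default probabilities and realized returns of closed loans.
p = [0.02, 0.05, 0.08, 0.12, 0.18, 0.25, 0.33, 0.41]
r = [0.09, 0.08, 0.08, 0.06, 0.03, -0.02, -0.10, -0.20]
grid = [0.02 * k for k in range(1, 11)]
h_opt = select_bandwidth(p, r, grid)
```

A small bandwidth makes each estimate depend on a few near neighbors, while a large one averages over almost all loans; the cross-validation score balances these two sources of error.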
To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^{*} = 0.05$. We also set the risk-free return rate to 0.025, which is approximately equal to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.
Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
Figure 2: Performance comparison. [Plot of the return rate of investment (vertical axis) for IOM and RIOM against the number of the parameter set (horizontal axis, 1 to 10).]
Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset   IOM     RIOM
1        0.0501  0.0566
2        0.0550  0.0633
3        0.0540  0.0618
4        0.0564  0.0696
5        0.0627  0.0714
6        0.0543  0.0629
7        0.0532  0.0688
8        0.0605  0.0711
9        0.0593  0.0706
10       0.0546  0.0664
11       0.0637  0.0701
12       0.0567  0.0640
13       0.0468  0.0569
14       0.0519  0.0663
15       0.0544  0.0620
16       0.0357  0.0472
17       0.0588  0.0710
18       0.0607  0.0774
19       0.0544  0.0655
20       0.0625  0.0808
Average  0.0553  0.0662
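The summary statistics of Table 2 can be recomputed directly from its rows. The short script below does this for the two columns; the variable names are illustrative, and the 0.05 threshold is the required rate of return $R^{*}$ stated in the text.

```python
# Return rates per test subset, copied from Table 2.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]

avg_iom = sum(iom) / len(iom)
avg_riom = sum(riom) / len(riom)

# Number of subsets in which RIOM earns more than IOM.
riom_wins = sum(1 for a, b in zip(iom, riom) if b > a)

# Subsets (1-based) in which both portfolios fall short of the required rate 0.05.
both_below_target = [k + 1 for k, (a, b) in enumerate(zip(iom, riom))
                     if a < 0.05 and b < 0.05]
```

The recomputed averages agree with the last row of the table to four decimal places, RIOM earns more than IOM in all 20 subsets, and subset 16 is the only one in which both portfolios return less than the required 5%.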
To verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^{*}$ and investment amount $M$.
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The results are shown more intuitively in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We find that the RIOM model comprehensively outperforms the IOM model.

Table 3: Investors' choices of input parameters for portfolio selection.

Set  Investment amount M (USD)  Required rate R* (%)
1    10000                      5.0
2    10000                      5.5
3    10000                      6.0
4    15000                      5.0
5    15000                      5.5
6    15000                      6.0
7    20000                      5.0
8    20000                      5.5
9    20000                      6.0
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.
6 Conclusions
In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantitative risk estimate instead of just labelling each loan with a credit grade. Secondly, the model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, the model considers the ambiguity of the loans' distribution (probability measure uncertainty) and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments indicate that the P2P lending investment decision model using robust optimization with relative entropy constraints performs better than the existing model.
Table 4: Investment performances of input parameters for portfolio selection.

M = $10,000
          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0598   0.0727    0.0601   0.0762    0.0502   0.0722
2         0.0500   0.0592    0.0601   0.0779    0.0675   0.0851
3         0.0441   0.0571    0.0491   0.0698    0.0735   0.0934
4         0.0525   0.0602    0.0658   0.0903    0.0636   0.0754
5         0.0532   0.0620    0.0631   0.0829    0.0513   0.0731
6         0.0634   0.0747    0.0564   0.0762    0.0717   0.1105
7         0.0613   0.0736    0.0547   0.0754    0.0551   0.0884
8         0.0529   0.0594    0.0505   0.0616    0.0685   0.0858
9         0.0548   0.0647    0.0550   0.0736    0.0559   0.0493
10        0.0474   0.0574    0.0472   0.0634    0.0499   0.0794
11        0.0597   0.0730    0.0602   0.0795    0.0661   0.1090
12        0.0644   0.0768    0.0541   0.0673    0.0624   0.1042
13        0.0635   0.0785    0.0709   0.0895    0.0532   0.0662
14        0.0593   0.0744    0.0626   0.0751    0.0634   0.1204
15        0.0523   0.0636    0.0485   0.0609    0.0571   0.0987
16        0.0549   0.0705    0.0684   0.0893    0.0508   0.1264
17        0.0549   0.0666    0.0549   0.0757    0.0538   0.0677
18        0.0546   0.0629    0.0512   0.0615    0.0560   0.0610
19        0.0492   0.0555    0.0572   0.0685    0.0657   0.0436
20        0.0554   0.0645    0.0413   0.0504    0.0596   0.0366
Average   0.0554   0.0664    0.0566   0.0732    0.0598   0.0823

M = $15,000
          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0501   0.0566    0.0558   0.0839    0.0520   0.0833
2         0.0550   0.0633    0.0517   0.0620    0.0551   0.0648
3         0.0540   0.0618    0.0598   0.0778    0.0631   0.0646
4         0.0564   0.0696    0.0512   0.0629    0.0553   0.0847
5         0.0627   0.0714    0.0566   0.0721    0.0616   0.0896
6         0.0543   0.0629    0.0570   0.0772    0.0585   0.0874
7         0.0532   0.0688    0.0528   0.0727    0.0620   0.0566
8         0.0605   0.0711    0.0545   0.0768    0.0645   0.0855
9         0.0593   0.0706    0.0507   0.0576    0.0574   0.1214
10        0.0546   0.0664    0.0528   0.0634    0.0622   0.0631
11        0.0637   0.0701    0.0562   0.0756    0.0498   0.0662
12        0.0567   0.0640    0.0529   0.0677    0.0574   0.0983
13        0.0468   0.0569    0.0637   0.0880    0.0504   0.0913
14        0.0519   0.0663    0.0568   0.0716    0.0614   0.1162
15        0.0544   0.0620    0.0577   0.0764    0.0633   0.0802
16        0.0357   0.0472    0.0642   0.0851    0.0573   0.0549
17        0.0588   0.0710    0.0674   0.0867    0.0615   0.0496
18        0.0607   0.0774    0.0585   0.0732    0.0687   0.0824
19        0.0544   0.0655    0.0434   0.0633    0.0589   0.0759
20        0.0625   0.0808    0.0562   0.0687    0.0698   0.0978
Average   0.0553   0.0662    0.0560   0.0731    0.0595   0.0807

M = $20,000
          R* = 5%            R* = 5.5%          R* = 6%
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0594   0.0774    0.0544   0.0689    0.0691   0.0649
2         0.0504   0.0584    0.0664   0.0895    0.0661   0.0769
3         0.0503   0.0568    0.0554   0.0737    0.0647   0.0869
4         0.0566   0.0648    0.0617   0.0856    0.0518   0.0635
5         0.0576   0.0709    0.0547   0.0692    0.0610   0.0771
6         0.0584   0.0683    0.0528   0.0701    0.0516   0.0385
7         0.0481   0.0549    0.0485   0.0631    0.0460   0.0759
8         0.0545   0.0701    0.0628   0.0823    0.0592   0.0845
9         0.0535   0.0586    0.0561   0.0764    0.0574   0.1038
10        0.0514   0.0597    0.0582   0.0683    0.0689   0.0532
11        0.0531   0.0584    0.0569   0.0657    0.0572   0.1141
12        0.0551   0.0678    0.0536   0.0734    0.0618   0.0687
13        0.0555   0.0695    0.0636   0.0824    0.0616   0.1157
14        0.0577   0.0674    0.0541   0.0636    0.0572   0.0818
15        0.0597   0.0775    0.0536   0.0706    0.0595   0.0704
16        0.0593   0.0698    0.0616   0.0807    0.0551   0.0748
17        0.0535   0.0641    0.0487   0.0636    0.0696   0.0915
18        0.0599   0.0729    0.0576   0.0746    0.0507   0.1069
19        0.0581   0.0673    0.0472   0.0638    0.0623   0.1148
20        0.0518   0.0709    0.0601   0.0734    0.0638   0.0744
Average   0.0552   0.0663    0.0564   0.0729    0.0597   0.0819
Data Availability
The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grant nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
References
[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.
[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.
[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.
[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.
[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.
[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.
[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.
[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.
[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.
[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.
[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.
[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.
[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.
[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.
[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.
[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.
[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.
[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.
[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.
[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.
[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.
[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.
[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.
[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.
[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.
[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.
[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.
[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.
[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.
[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.
[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.
[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.
[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.
[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.
[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.
[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.
[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.
individual investors possess only a small amount of funds, they need more refined risk assessment methods and investment strategies.
Similar to bond investment, P2P investors can fund a portion, not the whole, of each loan. Therefore, investors can decide which loans to invest in and, at the same time, determine the amount to invest in each loan. This mechanism allows investors to construct a credit portfolio to mitigate risk.
Markowitz [1] proposed the famous mean-variance model, which is still widely used in portfolio selection and risk management. Since then, researchers have proposed a variety of mean-risk models, such as the mean-downside risk model [2], the mean-VaR model [3], the mean-CVaR model [4], and so on. In practice, the distribution of the assets needs to be estimated first, and then the optimal portfolio can be identified by the optimization model.
For P2P lending investment, as mentioned above, such procedures face at least two major challenges, i.e., the deficiency of loans' historical observations and the ambiguity of the estimated loans' distribution (the probability measure uncertainty problem). Thus, this paper proposes a data-driven robust model of portfolio optimization based on relative entropy constraints, combined with an instance-based credit risk assessment method.
Specifically, we use the "instance-based" credit risk assessment method proposed by Guo et al. [5] to evaluate the return and risk of each loan when sufficient historical loan data for the individual borrower are not available. In this instance-based framework, the expected return of each loan is predicted as a weighted average of the historical loans of other, similar borrowers, where the optimal weights are learnt by kernel regression. Furthermore, using the moment information (mean and variance) of the new loans, we formulate the robust portfolio optimization model with relative entropy constraints, which yields an optimal portfolio under the worst scenario and can reduce the potential loss caused by uncertainty in the loans' distribution.
Our work is related to the papers by Guo et al. [5] and Yam et al. [6]. Guo et al. [5] introduce the instance-based framework into the credit risk assessment of P2P loans and use the classical mean-variance model to obtain the optimal allocation. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection. Although other scholars have contributed novel insights into the credit risk assessment of P2P lending and into robust optimization, to the best of our knowledge few have considered both together. The main contribution of this paper is a data-driven robust portfolio optimization model based on relative entropy constraints, combined with an instance-based risk assessment framework, for P2P loan investment; the model obtains superior performance in numerical experiments.
The rest of this paper is organized as follows. Section 2 provides the literature review. Section 3 introduces the instance-based model for credit risk assessment as well as the mathematical framework of the kernel regression approach. In Section 4, we elaborate the robust optimization model based on the relative entropy method and formulate a robust mean-variance optimization model for P2P lending investment. The empirical results on the effectiveness of our model are reported in Section 5. Finally, Section 6 concludes this work.
2 Literature Review
In order to assess risk and assist investment decision making in P2P lending, researchers have conducted many studies. Emekter et al. [7] explore the dominant factors that explain funding success and credit risk and, at the same time, measure the performance of P2P loans. They find that credit grade, debt-to-income ratio, FICO score, and revolving line utilization play an important role in loan defaults; furthermore, loans with lower credit grade and longer duration may result in a high mortality rate, and the higher interest rates charged to low-credit-grade borrowers are not sufficient to cover the potential loss from the higher likelihood of loan default. Thus, the authors suggest that investors should invest more in high-grade loans. Similarly, Berkovich's [8] study finds that high-quality loans offer excess return.
The above studies investigate the factors determining credit risk and analyze the performance of P2P loans; however, they do not propose a mechanism that assists individual investors in allocating funds across loans effectively and making optimal investment decisions.
To help personal lenders mitigate risk, popular online P2P platforms such as Lending Club and Prosper have developed credit scoring systems that assess the creditworthiness of each borrower using data mining or machine learning techniques. There is a large body of literature on credit rating using data mining techniques, for example, linear discriminant analysis (LDA) [9], k-nearest neighbors [10], logistic regression [11], classification and regression trees (CART) [12], Markov chains [13], survival analysis [14], artificial neural networks (ANN) [15], genetic methods [16], support vector machines (SVM) [17, 18], lasso-probit [19], and so on.
In the portfolio selection problem, full knowledge of the assets' distribution is usually assumed in order to determine the optimal portfolio. In most real-life applications, we need to approximate the assets' distribution. However, the approximations are not necessarily accurate; this is known as the distribution ambiguity (probability measure uncertainty) problem.
Robust optimization is an attractive way to solve the portfolio selection problem under distribution ambiguity. As the exact parameters are unavailable, Natarajan et al. [20] use a set of parameters (representing different distributions or scenarios) rather than a point estimate of the parameters to formulate the asset allocation problem. Following this idea, there are different ways to model ambiguity by using a set of parameters. Chen et al. [21] take the lower partial moments and CVaR as two risk measures and consider tight bounds that are likely to cover the possible parameters. Epstein [22] considers intervals that may include the actual parameters. Natarajan et al. [23] use a piecewise-linear concave utility function to derive accurate and estimated optimal strategies for the expected utility model in portfolio optimization under worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity and identify the optimal portfolio.
Since relative entropy measures the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.
In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors possess only historical data on asset returns. Bertsimas et al. [27] use the KS test, the $\chi^{2}$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the available data points. In contrast, the structure of the uncertainty set in our study is not predefined, and we consider the uncertainty of the mean, covariance, and distribution jointly. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with stock index data, while our work is based on the mean-variance framework with P2P loan data.
3 Instance-Based Model for Credit Risk Assessment
Using historical data to evaluate future performance and potential loss is a convention. However, unlike for bond or stock investment, historical yield data for the same P2P borrower are usually unavailable. Thus, the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].
3.1 Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan $i$, using $n$ past loans, each with a historical return $R_j$ ($j = 1, 2, \ldots, n$), we can calculate the expected return of loan $i$, $\mu_i$, as a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j, \tag{1}$$

where $w_{ij}$ denotes the weight of loan $j$ for predicting the expected return of loan $i$. The weight depends on the similarity between loan $i$ and loan $j$: intuitively, the greater the similarity, the greater the weight. The calculation of the weight is introduced in Section 3.2.

The weighted returns of the past loans are treated as historical observations of the new loan. Along this line of thought, taking variance as the risk measure, the weighted variance of the past loans is used to assess the new loan's risk, that is,

$$\sigma_i^{2} = \sum_{j=1}^{n} w_{ij} \left(R_j - \mu_i\right)^{2}, \tag{2}$$

where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and therefore the larger the weight. In particular, the absolute deviation of default probabilities between loans $i$ and $j$ is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans $i$ and $j$, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process is introduced in the next subsection.
3.2 Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method for investigating the nonlinear relation between random variables; it is based on kernel density estimation. First, the preliminaries of kernel estimation are introduced.

Given $n$ realizations $z_j$, $j = 1, \ldots, n$, of a random variable $z$, the kernel estimate $\hat{p}(z)$ of the probability density function $p(z)$ is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\!\left(\frac{z_j - z}{h}\right), \tag{3}$$

where $K(\cdot)$ is a kernel function and $h$ is a smoothing parameter.

The kernel function $K(\cdot)$ is nonnegative and bounded and satisfies the following properties: (a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$; (b) $\int_{-\infty}^{\infty} z K(z)\,dz = 0$; (c) $\int_{-\infty}^{\infty} z^{2} K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as the uniform, triangular, biweight, triweight, and Gaussian kernels [29]. Because the kernel estimate is insensitive to the choice of kernel function, we use the Gaussian kernel function for its convenient mathematical properties; it is written as $K(z) = \frac{1}{\sqrt{2\pi}} e^{-z^{2}/2}$.

The smoothing parameter $h = h(n)$, also called the bandwidth, depends on the sample size $n$. Specifically, $h(n)$ and $1/(n\,h(n))$ decrease to 0 as $n$ tends to $\infty$.

The literature shows that the choice of kernel function does not affect the estimate significantly, whereas the choice of the bandwidth is a vital issue [30, 31]. The determination of the bandwidth is shown in detail in Section 5.3.
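The kernel density estimate in (3) with the Gaussian kernel can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the sample values and the bandwidth below are hypothetical, chosen only to show how the estimate behaves.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kde(z, samples, h):
    """Kernel density estimate of equation (3) at point z with bandwidth h."""
    n = len(samples)
    return sum(gaussian_kernel((zj - z) / h) for zj in samples) / (n * h)


# Hypothetical realizations z_j (illustrative values only).
samples = [0.05, 0.08, 0.10, 0.12, 0.30]
h = 0.05  # hypothetical bandwidth; the paper selects h in Section 5.3

density_near_cluster = kde(0.10, samples, h)  # many samples lie nearby
density_in_gap = kde(0.20, samples, h)        # few samples lie nearby
```

The estimate is larger where the realizations cluster and smaller in the gap between them, and it integrates to one because the kernel does (property (a)).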
In the following, we introduce the kernel regression model proposed by Nadaraya [32]. We assume that each observation is denoted as $(X, Y)$, which is an $\mathbb{R}^{2}$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\!\left((x - x_j)/h\right)}{\sum_{k=1}^{n} K\!\left((x - x_k)/h\right)} \cdot y_j \right], \tag{4}$$

where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and the return rate of the $j$th loan, respectively. Thereby, the estimate of the $i$th loan's return can be written as

$$\hat{\mu}_i = \sum_{j=1}^{n} \left[ \frac{K\!\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\!\left((p_i - p_k)/h\right)} \cdot R_j \right]. \tag{5}$$

Note that the determination of the loans' default probabilities is introduced in Section 5.1.

Comparing (1) with (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K\!\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\!\left((p_i - p_k)/h\right)}. \tag{6}$$

Using the optimal weight $w_{ij}$ and the expected return $\hat{\mu}_i$ derived from (5), (2) can be rewritten as

$$\hat{\sigma}_i^{2} = \sum_{j=1}^{n} \left[ \frac{K\!\left((p_i - p_j)/h\right)}{\sum_{k=1}^{n} K\!\left((p_i - p_k)/h\right)} \cdot \left(R_j - \hat{\mu}_i\right)^{2} \right]. \tag{7}$$
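Equations (5) to (7) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the historical default probabilities, realized returns, and bandwidth are hypothetical values, and the function names are not from the paper.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kernel_weights(p_new, p_hist, h):
    """Weights of equation (6): kernel values of (p_new - p_j) / h, normalized to sum to 1."""
    raw = [gaussian_kernel((p_new - pj) / h) for pj in p_hist]
    total = sum(raw)
    return [r / total for r in raw]


def estimate_return_and_risk(p_new, p_hist, r_hist, h):
    """Expected return (5) and variance (7) of a new loan, plus the weights used."""
    w = kernel_weights(p_new, p_hist, h)
    mu = sum(wj * rj for wj, rj in zip(w, r_hist))
    var = sum(wj * (rj - mu) ** 2 for wj, rj in zip(w, r_hist))
    return mu, var, w


# Hypothetical closed loans: default probabilities and realized returns.
p_hist = [0.02, 0.05, 0.10, 0.25, 0.40]
r_hist = [0.06, 0.07, 0.09, -0.20, -0.60]

# New loan with (hypothetical) default probability 0.06 and bandwidth 0.05.
mu_hat, var_hat, w = estimate_return_and_risk(0.06, p_hist, r_hist, h=0.05)
```

Historical loans whose default probability is close to that of the new loan dominate the estimate: here the loan with default probability 0.05 receives the largest weight, and loans far away (0.40) contribute almost nothing.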
4 Robust Investment Decision Model
Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans. We start from the classical mean-variance optimization model proposed by Markowitz [1] and proceed to its tractable robust counterpart.
4.1 Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to the investor's risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^{n}$, where each weight denotes the proportion of wealth allocated to an asset. The return and risk of the portfolio are then $\lambda^{\mathrm{T}} \mu$ and $\lambda^{\mathrm{T}} V \lambda$, respectively, where $\mu \in \mathbb{R}^{n}$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$. Here, $P$ represents the ideal estimated market condition, in which $\mu$ and $V$, estimated using all available information (historical observations, news, expert knowledge, and so on), are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$
\begin{aligned}
(\text{MV}) \quad \min_{\lambda} \quad & \lambda^{\mathrm{T}} V \lambda \\
\text{s.t.} \quad & \lambda^{\mathrm{T}} \mu \ge R^{*}, \\
& \lambda \in \Omega,
\end{aligned} \tag{8}
$$

where $\Omega \subseteq \mathbb{R}^{n}$ denotes the set of feasible portfolios and $R^{*}$ is the required rate of return specified by the investor.
In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by using the estimated input parameters $\mu$ and $V$ directly may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters, with solutions that remain feasible no matter what the actual values of the parameters are.
Investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios, and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$
\begin{aligned}
(\text{RMV}) \quad \min_{\lambda} \ \sup_{Q \in \mathcal{Q}} \quad & \lambda^{\mathrm{T}} V_Q \lambda \\
\text{s.t.} \quad & \inf_{Q \in \mathcal{Q}} \lambda^{\mathrm{T}} \mu_Q \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned} \tag{9}
$$
It is rational to assume that the actual values of the parameters are in the neighborhood of the estimates. Thus, we can generate the uncertainty set $\mathcal{Q}$ based on the assumption that the measures in the set should not be far from the ideal measure $P$. Relative entropy, also known as the Kullback–Leibler divergence, can be used to measure the difference between probability measures. The relative entropy of the measure $Q$ in $\mathcal{Q}$ with respect to the measure $P$ is

$$D_{KL}(Q \parallel P) := \int q(x) \ln \frac{q(x)}{p(x)}\, dx, \tag{10}$$

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under the probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \parallel P)$ can be rewritten as

$$D_{KL}(Q \parallel P) = \frac{1}{2} \left[ \ln |V| - \ln |V_Q| + \operatorname{tr}\!\left(V^{-1} V_Q\right) - n + \left(\mu - \mu_Q\right)^{\mathrm{T}} V^{-1} \left(\mu - \mu_Q\right) \right], \tag{11}$$

where $\mu$, $V$, $\mu_Q$, and $V_Q$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^{\mathrm{T}}$ denote the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
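To make (11) concrete, the sketch below evaluates it in Python for the special case of diagonal covariance matrices, where the determinant, trace, and quadratic form reduce to sums. Two assumptions are worth flagging: the closed form (11) is the one that holds when both measures are multivariate normal, and the diagonal restriction as well as all numbers are illustrative choices, not taken from the paper.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """Equation (11) for diagonal covariance matrices V_Q and V.

    mu_q, var_q: mean vector and diagonal of V_Q under the measure Q.
    mu_p, var_p: mean vector and diagonal of V under the measure P.
    With diagonal matrices, ln|V| is a sum of logs, tr(V^{-1} V_Q) is a sum
    of variance ratios, and the quadratic form is a sum of scaled squares.
    """
    n = len(mu_p)
    log_det_p = sum(math.log(v) for v in var_p)
    log_det_q = sum(math.log(v) for v in var_q)
    trace_term = sum(vq / vp for vq, vp in zip(var_q, var_p))
    quad_term = sum((mp - mq) ** 2 / vp
                    for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_p - log_det_q + trace_term - n + quad_term)


# Hypothetical two-loan example.
mu_p = [0.06, 0.08]
var_p = [0.01, 0.04]

kl_same = kl_gaussian_diag(mu_p, var_p, mu_p, var_p)           # identical measures
kl_shift = kl_gaussian_diag([0.045, 0.06], var_p, mu_p, var_p)  # lower mean under Q
```

The divergence is zero when Q coincides with P and grows as the mean under Q moves away from the mean under P, which is why bounding it keeps the uncertainty set close to the estimated measure.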
Let $U$ denote the set of parameters $(\mu_Q, V_Q)$ under the measures $Q$ in $\mathcal{Q}$. Using the relative entropy constraint, we can rewrite the robust optimization model (9) as

$$
\begin{aligned}
(\text{RMV-RE}) \quad \min_{\lambda} \ \max_{(\mu_Q, V_Q) \in U} \quad & \lambda^{\mathrm{T}} V_Q \lambda \\
\text{s.t.} \quad & \min_{(\mu_Q, V_Q) \in U} \lambda^{\mathrm{T}} \mu_Q \ge R^{*}, \\
& D_{KL}(Q \parallel P) \le K, \\
& \lambda \in \Omega,
\end{aligned} \tag{12}
$$

where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investor's confidence in $\mu$ and $V$ estimated under the probability measure $P$: the greater the value of $K$, the lower the confidence.
Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is tractable and can be solved efficiently. That is,

$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^{n}} \quad & \lambda^{\mathrm{T}} V^{*} \lambda \\
\text{s.t.} \quad & \lambda^{\mathrm{T}} \mu^{*} \ge R^{*}, \\
& \lambda \in \Omega.
\end{aligned} \tag{13}
$$

Herein, $\mu^{*} = \zeta \mu$, $V^{*} = V + \zeta (1 - \zeta) \mu \mu^{\mathrm{T}}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under the measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and as $\zeta$ decreases the investor's confidence weakens. For the details of the proof, we refer the reader to Yam et al. [6].
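The adjustment from $(\mu, V)$ to $(\mu^{*}, V^{*})$ is a simple transformation of the estimated inputs. The Python sketch below illustrates it with plain lists; the function name and the two-asset numbers are hypothetical, and $\zeta = 0.75$ is the value used in the empirical section.

```python
def robust_parameters(mu, V, zeta):
    """Return (mu_star, V_star) with mu* = zeta * mu and
    V* = V + zeta * (1 - zeta) * mu mu^T, as in problem (13)."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    mu_star = [zeta * m for m in mu]
    c = zeta * (1.0 - zeta)
    V_star = [[V[i][j] + c * mu[i] * mu[j] for j in range(n)]
              for i in range(n)]
    return mu_star, V_star


# Hypothetical two-asset inputs.
mu = [0.06, 0.08]
V = [[0.01, 0.0],
     [0.0, 0.04]]

mu_s, V_s = robust_parameters(mu, V, zeta=0.75)
mu_1, V_1 = robust_parameters(mu, V, zeta=1.0)  # full confidence: inputs unchanged
```

With $\zeta < 1$, expected returns are scaled down and a rank-one term is added to the covariance matrix, so the robust problem is more conservative on both return and risk; with $\zeta = 1$ it reduces to the classical problem (8).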
4.2. Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_i$ and $\sigma_i^2$, using the instance-based credit risk assessment model. Let $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^{\mathrm T}$ and

$$V = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_n^2 \end{bmatrix} \tag{14}$$
denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as

$$\begin{aligned} \min_{\lambda \in \mathbb{R}^n}\ & \lambda^{\mathrm T} \left(V + \zeta(1-\zeta)\mu\mu^{\mathrm T}\right) \lambda \\ \text{s.t.}\ & \lambda^{\mathrm T} (\zeta\mu) \ge R^* \\ & \lambda \in \Omega \end{aligned} \tag{15}$$

Table 1: Description of variables.

Variable  Description
X1        FICO score of the borrower
X2        The number of inquiries of the borrower in the last 6 months
X3        The monetary amount of the loan
X4        The homeownership status of the borrower (0 = rent, 1 = own)
X5        The debt-to-income ratio of the borrower
X6        The number of accounts delinquent
X7        The number of public records in the past 10 years
Y         Dependent variable (0 = completed, 1 = default)
The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.

(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.

(3) For each loan, the amount that a lender can invest is no more than the amount $m_i$ requested by the borrower; thereby $\lambda_i M \le m_i$, where $M$ is the total investment amount that the investor has available.
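The objective of (15) and the constraints above can be checked for a candidate allocation with the following sketch. It relies on the diagonal form of $V$, under which $\lambda^{\mathrm T}(V + \zeta(1-\zeta)\mu\mu^{\mathrm T})\lambda = \sum_i \sigma_i^2\lambda_i^2 + \zeta(1-\zeta)(\lambda^{\mathrm T}\mu)^2$. The function names, the tolerance, and all numbers are illustrative assumptions; this is an evaluator for a given $\lambda$, not a solver.

```python
def robust_objective(lam, mu, sigma2, zeta):
    """Objective of (15) for a diagonal covariance matrix V = diag(sigma2)."""
    own_risk = sum(s * l * l for s, l in zip(sigma2, lam))       # sum_i sigma_i^2 lambda_i^2
    port_return = sum(l * m for l, m in zip(lam, mu))            # lambda^T mu
    return own_risk + zeta * (1.0 - zeta) * port_return ** 2

def is_feasible(lam, mu, zeta, r_star, total_amount, requested, tol=1e-9):
    """Check the return constraint of (15) and the three constraints defining Omega."""
    budget_ok = abs(sum(lam) - 1.0) <= tol                       # (1) weights sum to one
    no_short = all(l >= -tol for l in lam)                       # (2) no short-selling
    within_request = all(l * total_amount <= m + tol             # (3) lambda_i * M <= m_i
                         for l, m in zip(lam, requested))
    return_ok = zeta * sum(l * m for l, m in zip(lam, mu)) >= r_star - tol
    return budget_ok and no_short and within_request and return_ok

# Hypothetical three-loan example.
mu = [0.09, 0.07, 0.08]
sigma2 = [0.020, 0.010, 0.015]
lam = [0.5, 0.2, 0.3]
print(is_feasible(lam, mu, 0.75, 0.05, 15000, [8000, 5000, 6000]))
print(robust_objective(lam, mu, sigma2, 0.75))
```

A quadratic programming solver would search over all feasible $\lambda$ for the smallest value of `robust_objective`; the evaluator is useful mainly for validating a solver's output.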
5. Empirical Analysis

In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using the real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1. Data Description and Preprocess. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1–7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.
Figure 1: The curve of CV(h) (horizontal axis: h; vertical axis: CV(h)).
There exist many credit scoring models to predict the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
We randomly divide the dataset into two parts: one containing 40% of all loans, used for determining the optimal bandwidth $h$ in (5), which will be described in detail in Section 5.3, and a second containing the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (thus their pay-back statuses are treated as unknown), and all other subsets are taken as the training set, which consists of historical loans with known yields.
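The split described above can be sketched as follows. The function names, the fixed seed, and the use of index lists are assumptions made for illustration; the original experiments were run in MATLAB and their exact sampling procedure is not given.

```python
import random

def split_loans(n_loans, tuning_share=0.4, n_folds=20, seed=0):
    """Shuffle loan indices, hold out a tuning part, and cut the rest into folds."""
    idx = list(range(n_loans))
    random.Random(seed).shuffle(idx)
    n_tuning = round(tuning_share * n_loans)
    tuning, rest = idx[:n_tuning], idx[n_tuning:]
    folds = [rest[k::n_folds] for k in range(n_folds)]   # near-equal fold sizes
    return tuning, folds

def cv_rounds(folds):
    """Yield (training, testing) index lists, using each fold once as the testing set."""
    for k, testing in enumerate(folds):
        training = [i for m, fold in enumerate(folds) if m != k for i in fold]
        yield training, testing

tuning, folds = split_loans(17001)
print(len(tuning), sorted({len(f) for f in folds}))   # size of the bandwidth-tuning part and the fold sizes
```

With 17001 loans, the second part holds 10201 loans, so each of the 20 folds has 510 or 511 loans, which agrees with the "approximately 510" stated in the text.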
5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. Then the classical mean-variance model (8) is used to identify the optimal allocation strategy. As the results of Guo et al. [5] show, this model outperforms some rating-based models.

RIOM is the robust instance-based model in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust credit portfolio optimization model based on the relative entropy method, (15), to obtain the optimal investment decision.
We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i^2$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.

(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment on the optimal portfolio.

(3) Compare the return rates of the two models.
5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(z) = (1/\sqrt{2\pi})e^{-z^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:

$$h_{\mathrm{optimal}} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_h(p_{-i}) - \mu_i\right)^2 \tag{16}$$

where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1,\, j \ne i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1,\, j \ne i}^{n} K\left((p_i - p_j)/h\right)} \cdot R_j \right] \tag{17}$$
The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimum point, and the value of $h$ corresponding to this point is the optimal bandwidth for the model.
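A minimal sketch of the bandwidth search in (16) and (17) is given below. It makes two assumptions that are not stated in the text: the observed return $R_i$ is used as the target in place of $\mu_i$, and the candidate bandwidths form a simple grid. The data are synthetic and the function names are hypothetical.

```python
import math

def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def loo_cv_score(p, R, h):
    """CV(h) as in (16): sum of squared leave-one-out prediction errors."""
    n = len(p)
    score = 0.0
    for i in range(n):
        # Kernel weights of all other loans; loan i itself is left out as in (17).
        weights = [gaussian_kernel((p[i] - p[j]) / h) if j != i else 0.0 for j in range(n)]
        total = sum(weights)
        if total == 0.0:          # every other loan is too far away for this bandwidth
            return math.inf
        estimate = sum(w * r for w, r in zip(weights, R)) / total
        score += (estimate - R[i]) ** 2   # assumption: observed R_i stands in for mu_i
    return score

def select_bandwidth(p, R, grid):
    """Return the grid value with the smallest CV score."""
    return min(grid, key=lambda h: loo_cv_score(p, R, h))

# Synthetic default probabilities and realized returns (made-up values).
p = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40]
R = [0.08, 0.07, 0.07, 0.05, 0.02, -0.10]
grid = [0.02 * k for k in range(1, 11)]   # 0.02, 0.04, ..., 0.20
best_h = select_bandwidth(p, R, grid)
print(best_h, loo_cv_score(p, R, best_h))
```

The grid mirrors the range of the horizontal axis in Figure 1; a finer grid or a continuous optimizer could be substituted without changing the logic.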
To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^* = 0.05$. We also set the risk-free return rate as 0.025, which is approximately equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
Figure 2: Performance comparison (horizontal axis: the number of the parameter set; vertical axis: return rate of investment; series: IOM and RIOM).
Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset   IOM     RIOM
1        0.0501  0.0566
2        0.0550  0.0633
3        0.0540  0.0618
4        0.0564  0.0696
5        0.0627  0.0714
6        0.0543  0.0629
7        0.0532  0.0688
8        0.0605  0.0711
9        0.0593  0.0706
10       0.0546  0.0664
11       0.0637  0.0701
12       0.0567  0.0640
13       0.0468  0.0569
14       0.0519  0.0663
15       0.0544  0.0620
16       0.0357  0.0472
17       0.0588  0.0710
18       0.0607  0.0774
19       0.0544  0.0655
20       0.0625  0.0808
Average  0.0553  0.0662
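The averages in Table 2 and the subset-by-subset comparison can be reproduced from the tabulated values with a short check. The variable names are arbitrary; the numbers are copied from Table 2.

```python
# Per-subset return rates from Table 2 (subsets 1 to 20).
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]

avg_iom = sum(iom) / len(iom)
avg_riom = sum(riom) / len(riom)
riom_wins = sum(r > i for i, r in zip(iom, riom))   # subsets where RIOM beats IOM

print(round(avg_iom, 4), round(avg_riom, 4), riom_wins)   # 0.0553 0.0662 20
```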
In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^*$ and investment amount $M$.
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The results are shown more intuitively in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can find that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

Set  Investment amount M  Required rate R* (%)
1    $10000               5.0
2    $10000               5.5
3    $10000               6.0
4    $15000               5.0
5    $15000               5.5
6    $15000               6.0
7    $20000               5.0
8    $20000               5.5
9    $20000               6.0
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.

6. Conclusions

In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and presents a more intuitive and quantified risk estimate to investors instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, this model considers the ambiguity of the loans' distribution (probability measure uncertainty) and uses relative entropy to model parameter uncertainty, so as to ensure that the optimal allocation strategy is efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.
Table 4: Investment performances of input parameters for portfolio selection.

         M = $10000                                M = $15000                                M = $20000
         R* = 5%       R* = 5.5%     R* = 6%       R* = 5%       R* = 5.5%     R* = 6%       R* = 5%       R* = 5.5%     R* = 6%
Subset   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM   IOM    RIOM
1        0.0598 0.0727 0.0601 0.0762 0.0502 0.0722 0.0501 0.0566 0.0558 0.0839 0.0520 0.0833 0.0594 0.0774 0.0544 0.0689 0.0691 0.0649
2        0.0500 0.0592 0.0601 0.0779 0.0675 0.0851 0.0550 0.0633 0.0517 0.0620 0.0551 0.0648 0.0504 0.0584 0.0664 0.0895 0.0661 0.0769
3        0.0441 0.0571 0.0491 0.0698 0.0735 0.0934 0.0540 0.0618 0.0598 0.0778 0.0631 0.0646 0.0503 0.0568 0.0554 0.0737 0.0647 0.0869
4        0.0525 0.0602 0.0658 0.0903 0.0636 0.0754 0.0564 0.0696 0.0512 0.0629 0.0553 0.0847 0.0566 0.0648 0.0617 0.0856 0.0518 0.0635
5        0.0532 0.0620 0.0631 0.0829 0.0513 0.0731 0.0627 0.0714 0.0566 0.0721 0.0616 0.0896 0.0576 0.0709 0.0547 0.0692 0.0610 0.0771
6        0.0634 0.0747 0.0564 0.0762 0.0717 0.1105 0.0543 0.0629 0.0570 0.0772 0.0585 0.0874 0.0584 0.0683 0.0528 0.0701 0.0516 0.0385
7        0.0613 0.0736 0.0547 0.0754 0.0551 0.0884 0.0532 0.0688 0.0528 0.0727 0.0620 0.0566 0.0481 0.0549 0.0485 0.0631 0.0460 0.0759
8        0.0529 0.0594 0.0505 0.0616 0.0685 0.0858 0.0605 0.0711 0.0545 0.0768 0.0645 0.0855 0.0545 0.0701 0.0628 0.0823 0.0592 0.0845
9        0.0548 0.0647 0.0550 0.0736 0.0559 0.0493 0.0593 0.0706 0.0507 0.0576 0.0574 0.1214 0.0535 0.0586 0.0561 0.0764 0.0574 0.1038
10       0.0474 0.0574 0.0472 0.0634 0.0499 0.0794 0.0546 0.0664 0.0528 0.0634 0.0622 0.0631 0.0514 0.0597 0.0582 0.0683 0.0689 0.0532
11       0.0597 0.0730 0.0602 0.0795 0.0661 0.1090 0.0637 0.0701 0.0562 0.0756 0.0498 0.0662 0.0531 0.0584 0.0569 0.0657 0.0572 0.1141
12       0.0644 0.0768 0.0541 0.0673 0.0624 0.1042 0.0567 0.0640 0.0529 0.0677 0.0574 0.0983 0.0551 0.0678 0.0536 0.0734 0.0618 0.0687
13       0.0635 0.0785 0.0709 0.0895 0.0532 0.0662 0.0468 0.0569 0.0637 0.0880 0.0504 0.0913 0.0555 0.0695 0.0636 0.0824 0.0616 0.1157
14       0.0593 0.0744 0.0626 0.0751 0.0634 0.1204 0.0519 0.0663 0.0568 0.0716 0.0614 0.1162 0.0577 0.0674 0.0541 0.0636 0.0572 0.0818
15       0.0523 0.0636 0.0485 0.0609 0.0571 0.0987 0.0544 0.0620 0.0577 0.0764 0.0633 0.0802 0.0597 0.0775 0.0536 0.0706 0.0595 0.0704
16       0.0549 0.0705 0.0684 0.0893 0.0508 0.1264 0.0357 0.0472 0.0642 0.0851 0.0573 0.0549 0.0593 0.0698 0.0616 0.0807 0.0551 0.0748
17       0.0549 0.0666 0.0549 0.0757 0.0538 0.0677 0.0588 0.0710 0.0674 0.0867 0.0615 0.0496 0.0535 0.0641 0.0487 0.0636 0.0696 0.0915
18       0.0546 0.0629 0.0512 0.0615 0.0560 0.0610 0.0607 0.0774 0.0585 0.0732 0.0687 0.0824 0.0599 0.0729 0.0576 0.0746 0.0507 0.1069
19       0.0492 0.0555 0.0572 0.0685 0.0657 0.0436 0.0544 0.0655 0.0434 0.0633 0.0589 0.0759 0.0581 0.0673 0.0472 0.0638 0.0623 0.1148
20       0.0554 0.0645 0.0413 0.0504 0.0596 0.0366 0.0625 0.0808 0.0562 0.0687 0.0698 0.0978 0.0518 0.0709 0.0601 0.0734 0.0638 0.0744
Average  0.0554 0.0664 0.0566 0.0732 0.0598 0.0823 0.0553 0.0662 0.0560 0.0731 0.0595 0.0807 0.0552 0.0663 0.0564 0.0729 0.0597 0.0819
Data Availability
The data used in this paper are downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The research is supported by the National Natural Science Foundation of China (Grant nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
References
[1] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.
[3] N. Larsen, H. Mausser, and S. Uryasev, “Algorithms for optimization of Value-at-Risk,” in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.
[4] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for general loss distributions,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.
[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, “Instance-based credit risk assessment for investment decisions in P2P Lending,” European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.
[6] S. C. P. Yam, H. Yang, and F. L. Yuen, “Optimal asset allocation: Risk and information uncertainty,” European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.
[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, “Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending,” Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.
[8] E. Berkovich, “Search and herding effects in peer-to-peer lending: evidence from prosper.com,” Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.
[9] E. I. Altman, “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy,” The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.
[10] S. Chatterjee and S. Barcun, “A nonparametric approach to credit screening,” Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.
[11] J. C. Wiginton, “A note on the comparison of logit and discriminant models of consumer credit behavior,” Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.
[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.
[13] M. M. So and L. C. Thomas, “Modelling the profitability of credit cards by Markov decision processes,” European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.
[14] G. Andreeva, J. Ansell, and J. Crook, “Modelling profitability using survival combination scores,” European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.
[15] D. West, “Neural network credit scoring models,” Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.
[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, “Two-stage genetic programming (2SGP) for the credit scoring model,” Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.
[17] C. L. Huang, M. C. Chen, and C. J. Wang, “Credit scoring with a data mining approach based on support vector machines,” Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.
[18] P. Danenas and G. Garsva, “Selection of support vector machines based classifiers for credit risk domain,” Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.
[19] G. Sermpinis, S. Tsoukas, and P. Zhang, “Modelling market implied ratings using LASSO variable selection techniques,” Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.
[20] K. Natarajan, D. Pachamanova, and M. Sim, “Constructing risk measures from uncertainty sets,” Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.
[21] L. Chen, S. He, and S. Zhang, “Tight bounds for some risk measures, with applications to robust portfolio selection,” Operations Research, vol. 59, no. 4, pp. 847–865, 2011.
[22] L. G. Epstein, “A paradox for the ‘smooth ambiguity’ model of preference,” Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.
[23] K. Natarajan, M. Sim, and J. Uichanco, “Tractable robust expected utility and risk models for portfolio optimization,” Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.
[24] A. B. Pac and M. C. Pınar, “Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity,” TOP, vol. 22, no. 3, pp. 875–891, 2014.
[25] L. P. Hansen and T. J. Sargent, “Robust control and model uncertainty,” The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.
[26] G. C. Calafiore, “Ambiguous risk measures and optimal robust portfolios,” Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.
[27] D. Bertsimas, V. Gupta, and N. Kallus, “Data-driven robust optimization,” Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.
[28] Z. Kang, X. Li, Z. Li, and S. Zhu, “Data-driven robust mean-CVaR portfolio selection under distribution ambiguity,” Quantitative Finance, pp. 1–17, 2018.
[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.
[30] O. Scaillet, “Nonparametric estimation and sensitivity analysis of expected shortfall,” Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.
[31] H. Yao, Z. Li, and Y. Lai, “Mean–CVaR portfolio selection: A nonparametric estimation framework,” Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.
[32] E. A. Nadaraja, “On non-parametric estimates of density functions and regression,” Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.
[33] T. Chen and T. He, “Higgs boson discovery with boosted trees,” in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.
[34] Y. Xia, C. Liu, Y. Li, and N. Liu, “A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring,” Expert Systems with Applications, vol. 78, pp. 225–241, 2017.
[35] H. He, W. Zhang, and S. Zhang, “A novel ensemble method for credit scoring: Adaption of different imbalance ratios,” Expert Systems with Applications, vol. 98, pp. 105–117, 2018.
[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, “A hybrid KMV model, random forests and rough set theory approach for credit rating,” Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.
[37] S. Oreski, D. Oreski, and G. Oreski, “Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment,” Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.
[38] V. Kozeny, “Genetic algorithms for credit scoring: Alternative fitness function performance comparison,” Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.
accurate and estimated optimal strategies for the expected utility model in the portfolio optimization issue under the worst-case scenarios. Pac and Pinar [24] use an ellipsoidal uncertainty set to represent the distribution ambiguity in order to identify the optimal portfolio.

Since relative entropy can measure the difference between two probability distributions (probability measures), it can be used to construct the uncertainty set for robust optimization. In the studies of Hansen and Sargent [25] and Calafiore [26], relative entropy is used to model uncertainty and obtain the optimal investment decision. Yam et al. [6] derive a robust mean-variance optimization model with relative entropy constraints on the uncertainty of the interaction between the returns of different assets and discuss its mathematical and financial properties in portfolio selection.

In recent years, data-driven methods have been well studied. In this framework, it is assumed that investors only possess information about the historical data of asset returns. Bertsimas et al. [27] use the KS test, the $\chi^2$ test, the Anderson-Darling test, and some other testing tools to construct uncertainty sets and take the worst case of each set to formulate the robust optimization. They assume that the uncertainty sets are defined by certain structures and sizes based on the data points available. In contrast, the structure of the uncertainty set in our study is not predefined, and we consider the uncertainty of the mean, covariance, and distribution synthetically. Kang et al. [28] propose a data-driven robust mean-CVaR portfolio selection model under the condition of distribution ambiguity and adopt a nonparametric bootstrap approach to calibrate the levels of ambiguity. Their work is based on the mean-CVaR framework with data of stock indices, while our work is based on the mean-variance framework with data of P2P loans.

3. Instance-Based Model for Credit Risk Assessment

Using historical data to evaluate future performance and potential loss is a convention. However, unlike bond or stock investment, historical yield data about the same P2P borrower are usually unavailable. Thus, the risk assessment of a new loan is very challenging. In this section, we briefly introduce the instance-based credit risk assessment model proposed by Guo et al. [5].
3.1. Instance-Based Assessment Framework. In this instance-based assessment framework, the expected return of each loan is estimated as a weighted average of historical observations of other borrowers' closed loans. Specifically, for a new loan $i$, using $n$ past loans, each with a historical return $R_j$ ($j = 1, 2, \ldots, n$), we can calculate the expected return of loan $i$, $\mu_i$, as a weighted average of the past loans' actual returns:

$$\mu_i = \sum_{j=1}^{n} w_{ij} R_j \tag{1}$$

where $w_{ij}$ denotes the weight of loan $j$ for predicting the expected return of loan $i$. The weight depends on the similarity between loan $i$ and loan $j$. Intuitively, the greater the similarity, the greater the weight. The calculation of the weight will be introduced in Section 3.2.
The weighted returns of the past loans are treated as historical observations of a new loan. Following this line of thought and taking variance as the risk measure, the weighted variance of past loans is used to assess the new loan's risk; that is,

$$\sigma_i^2 = \sum_{j=1}^{n} w_{ij} \left(R_j - \mu_i\right)^2 \tag{2}$$

where $w_{ij}$, $R_j$, and $\mu_i$ have the same meanings as in (1).

The absolute deviation between two loans' default probabilities is used to measure the similarity: the smaller the absolute deviation, the greater the similarity and therefore the larger the weight. In particular, the absolute deviation of default probabilities between loans $i$ and $j$ is defined as $d_{ij} = |p_i - p_j|$, where $p_i$ and $p_j$ are the default probabilities of loans $i$ and $j$, respectively. Kernel regression is exploited to investigate the nonlinear relationship between the absolute deviation and the weight. This process will be introduced in the next subsection.
3.2. Kernel Regression of Return and Risk. Kernel regression is a nonparametric statistical method to investigate the nonlinear relation between random variables, which is based on kernel density estimation. First of all, the preliminaries of kernel estimation are introduced.

Given $n$ realizations $z_j$, $j = 1, \ldots, n$, of a random variable $z$, the kernel estimate $\hat{p}(z)$ of the probability density function $p(z)$ is defined by

$$\hat{p}(z) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{z_j - z}{h}\right) \tag{3}$$

where $K(\cdot)$ is a kernel function and $h$ is a smoothing parameter.

The kernel function $K(\cdot)$ is nonnegative and bounded and satisfies the following properties: (a) $\int_{-\infty}^{\infty} K(z)\,dz = 1$; (b) $\int_{-\infty}^{\infty} zK(z)\,dz = 0$; (c) $\int_{-\infty}^{\infty} z^2 K(z)\,dz < \infty$.

There is a range of commonly used kernel functions, such as uniform, triangular, biweight, triweight, and Gaussian [29]. Because the kernel estimation is insensitive to the choice of kernel function, we use the Gaussian kernel function due to its convenient mathematical properties, which is written as $K(z) = (1/\sqrt{2\pi})e^{-z^2/2}$.
The smoothing parameter $h = h(n)$ is also called the bandwidth and depends on the sample size $n$. Specifically, $h(n)$ decreases to 0 and $n \cdot h(n)$ tends to $\infty$ as $n$ tends to $\infty$.

Many studies reveal that the choice of kernel function does not affect the estimation significantly; however, the choice of the bandwidth is a vital issue [30, 31]. The determination of the bandwidth will be shown in detail in Section 5.3.
In the following, we introduce the kernel regression model proposed by Nadaraya [32]. Theoretically, we assume that each observation is denoted as $(X, Y)$, which is an $\mathbb{R}^2$-valued random vector. With the sample set $\{(x_j, y_j) \mid j = 1, 2, \ldots, n\}$, the kernel estimator $\hat{y}$ of the target $y$ given its predictive observation $x$ is defined as

$$\hat{y} = \sum_{j=1}^{n} \left[ \frac{K\left((x - x_j)/h\right)}{\sum_{j=1}^{n} K\left((x - x_j)/h\right)} \cdot y_j \right] \tag{4}$$
where $K(\cdot)$ is a kernel function and $h$ is the bandwidth.

For the instance-based credit risk modeling, the set of historical observations is represented as $\{(p_j, R_j) \mid j = 1, 2, \ldots, n\}$, where $p_j$ and $R_j$ are the default probability and return rate of the $j$th loan, respectively. Thereby, the estimate of the $i$th loan's return can be written as

$$\mu_i = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1}^{n} K\left((p_i - p_j)/h\right)} \cdot R_j \right] \tag{5}$$
Note that the determination of the loans' default probabilities will be introduced in Section 5.1.

Comparing (1) to (5), we can represent the optimal weight $w_{ij}$ as

$$w_{ij} = \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1}^{n} K\left((p_i - p_j)/h\right)} \tag{6}$$
Using the optimal weight $w_{ij}$ and the expected return $\mu_i$ derived from (5), (2) can be rewritten as

$$\sigma_i^2 = \sum_{j=1}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1}^{n} K\left((p_i - p_j)/h\right)} \cdot \left(R_j - \mu_i\right)^2 \right] \tag{7}$$
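Equations (5) to (7) translate into a short routine: compute Gaussian kernel weights from the differences in default probability, then take the weighted mean and the weighted variance of the historical returns. The sketch below is illustrative only; the function names, the bandwidth value, and the data are assumptions.

```python
import math

def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def kernel_weights(p_new, p_hist, h):
    """Weights w_ij from (6) for a new loan with default probability p_new."""
    raw = [gaussian_kernel((p_new - p_j) / h) for p_j in p_hist]
    total = sum(raw)
    return [k / total for k in raw]

def estimate_return_and_risk(p_new, p_hist, R_hist, h):
    """Expected return (5) and variance (7) of a new loan from past loans."""
    w = kernel_weights(p_new, p_hist, h)
    mu = sum(w_j * r_j for w_j, r_j in zip(w, R_hist))
    var = sum(w_j * (r_j - mu) ** 2 for w_j, r_j in zip(w, R_hist))
    return mu, var

# Hypothetical history: default probabilities and realized returns of past loans.
p_hist = [0.05, 0.10, 0.20, 0.35]
R_hist = [0.08, 0.07, 0.04, -0.15]
print(estimate_return_and_risk(0.12, p_hist, R_hist, h=0.05))
```

Past loans whose default probability is close to that of the new loan dominate both estimates, while distant loans receive weights that decay rapidly with the bandwidth-scaled distance.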
4. Robust Investment Decision Model

Similar to bond investment, P2P lenders can invest in a portion of each loan. Thus, P2P loan investment decisions can be transformed into a credit portfolio optimization problem. This section introduces the portfolio optimization model for investment decisions in P2P lending, which accounts for the uncertainty of the distribution of the loans. We proceed from the classical mean-variance optimization model proposed by Markowitz [1] to its tractable robust counterpart.
4.1. Robust Optimization Model Based on Relative Entropy Constraints. In the classical mean-variance optimization model, the optimal asset allocation strategy is identified by solving the tradeoff between risk and return according to the investors' risk preference. A portfolio that invests in $n$ assets is represented as a vector of weights $\lambda \in \mathbb{R}^n$, where each weight denotes the proportion of wealth allocated to an asset. Then the return and risk of the portfolio become $\lambda^{\mathrm T}\mu$ and $\lambda^{\mathrm T}V\lambda$, respectively, where $\mu \in \mathbb{R}^n$ and $V \in \mathbb{R}^{n \times n}$ are the expected return and the covariance matrix of the assets' returns under the probability measure (or probability distribution) $P$, respectively. Here $P$ represents the ideal estimated market condition, where $\mu$ and $V$, estimated by using all available information, including historical observations, news, expert knowledge, and so on, are assumed to be the actual expected return and covariance matrix. Thus, the classical mean-variance portfolio selection problem (MV) can be formulated as

$$\text{(MV)}\quad \begin{aligned} \min_{\lambda}\ & \lambda^{\mathrm T} V \lambda \\ \text{s.t.}\ & \lambda^{\mathrm T} \mu \ge R^* \\ & \lambda \in \Omega \end{aligned} \tag{8}$$

where $\Omega \subseteq \mathbb{R}^n$ denotes the set of feasible portfolios and $R^*$ is the required return rate specified by the investor.
In reality, the assumption that the expected return $\mu$ and covariance matrix $V$ are known with certainty is hardly reasonable. It is quite possible that the estimated parameters differ from the actual ones. Thus, the optimal portfolio identified by directly using the estimated input parameters $\mu$ and $V$ may be inappropriate. Robust optimization seeks portfolios that are insensitive to the uncertainty in the parameters and solutions that remain feasible no matter what the actual values of the parameters are.
The investors might consider a set of probability measures, i.e., an uncertainty set, to cover a range of scenarios based on their assessments and then use robust optimization to obtain approximately optimal strategies for the worst scenarios within the uncertainty set. In this paper, we define $\mathcal{Q}$ as the set of probability measures representing the possible scenarios and $\mu_Q$ and $V_Q$ as the expected return and covariance matrix estimated under the probability measure $Q \in \mathcal{Q}$. Mathematically, the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as

$$\text{(RMV)}\quad \begin{aligned} \min_{\lambda}\ & \sup_{Q \in \mathcal{Q}} \lambda^{\mathrm T} V_Q \lambda \\ \text{s.t.}\ & \inf_{Q \in \mathcal{Q}} \lambda^{\mathrm T} \mu_Q \ge R^* \\ & \lambda \in \Omega \end{aligned} \tag{9}$$
It is rational to assume that the actual value of the parametersis in the neighborhood of the estimatorThus we can generatethe uncertainty set Q based on the assumption that themeasures in the set should be not far from the ideal measureP Relative entropy also known as the KullbackndashLeiblerdivergence can be used to measure the difference betweenprobability measures The relative entropy of the measure 119876in Q with respect to the measure P is
$$D_{KL}(Q \,\|\, P) := \int q(x) \ln \frac{q(x)}{p(x)}\, dx, \tag{10}$$

where $p(x)$ and $q(x)$ are the probability density functions (pdf) of the loans' returns under the probability measures $P$ and $Q$, respectively. In the context of mean-variance analysis, the relative entropy $D_{KL}(Q \,\|\, P)$ can be rewritten as

$$D_{KL}(Q \,\|\, P) = \frac{1}{2}\left[\ln |V| - \ln |V_{Q}| + \operatorname{tr}\left(V^{-1} V_{Q}\right) - n + (\mu - \mu_{Q})^{T} V^{-1} (\mu - \mu_{Q})\right], \tag{11}$$

where $\mu$, $V$, $\mu_{Q}$, and $V_{Q}$ carry the same meaning as in (8) and (9); $\operatorname{tr}(V)$, $|V|$, and $V^{T}$ are the trace, the determinant, and the transpose of $V$, respectively; and $n$ is the number of assets in the portfolio.
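As a numerical illustration of (11), the following minimal Python sketch evaluates the relative entropy in the special case of diagonal covariance matrices, i.e., uncorrelated assets. Expression (11) is the closed form of (10) when both measures are multivariate normal. The function name `kl_gaussian_diag` and all numbers are hypothetical and are not taken from the paper.

```python
import math


def kl_gaussian_diag(mu_q, var_q, mu_p, var_p):
    """Relative entropy D_KL(Q || P) of (11) for diagonal covariance matrices.

    mu_q, var_q: mean vector and diagonal of V_Q under the measure Q.
    mu_p, var_p: mean vector and diagonal of V under the ideal measure P.
    """
    n = len(mu_p)
    log_det_p = sum(math.log(v) for v in var_p)                 # ln|V|
    log_det_q = sum(math.log(v) for v in var_q)                 # ln|V_Q|
    trace_term = sum(vq / vp for vq, vp in zip(var_q, var_p))   # tr(V^{-1} V_Q)
    quad_term = sum((mp - mq) ** 2 / vp                         # (mu-mu_Q)^T V^{-1} (mu-mu_Q)
                    for mp, mq, vp in zip(mu_p, mu_q, var_p))
    return 0.5 * (log_det_p - log_det_q + trace_term - n + quad_term)


# Hypothetical two-loan example.
mu_p = [0.06, 0.08]
var_p = [0.01, 0.04]

kl_same = kl_gaussian_diag(mu_p, var_p, mu_p, var_p)             # Q identical to P
kl_shifted = kl_gaussian_diag([0.05, 0.07], var_p, mu_p, var_p)  # means shifted by 0.01
```

The divergence vanishes when $Q$ coincides with $P$ and grows as the means or variances under $Q$ move away from those under $P$, which is why a bound on it can serve as the radius of the uncertainty set.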
Let $U$ denote the set of parameters $(\mu_{Q}, V_{Q})$ under the measures $Q$ in $\mathcal{Q}$. Using the constraint of relative entropy, we can rewrite the robust optimization model (9) as

$$\text{(RMV-RE)}\quad \min_{\lambda}\ \max_{(\mu_{Q}, V_{Q}) \in U} \lambda^{T} V_{Q} \lambda \quad \text{s.t.}\quad \min_{(\mu_{Q}, V_{Q}) \in U} \lambda^{T}\mu_{Q} \ge R^{*},\quad D_{KL}(Q \,\|\, P) \le K,\quad \lambda \in \Omega, \tag{12}$$
where $K$ is a positive constant that determines the size of the uncertainty set. The parameter $K$ measures the level of uncertainty and reflects the investors' confidence in $\mu$ and $V$ estimated under the probability measure $P$; i.e., the greater the value of $K$, the lower the confidence.

Yam et al. [6] prove that the robust mean-variance portfolio selection model based on the relative entropy method (RMV-RE) can be formulated as a quadratic optimization problem, which is tractable and can be solved efficiently. That is,
$$\min_{\lambda \in \mathbb{R}^{n}}\ \lambda^{T} V^{*} \lambda \quad \text{s.t.}\quad \lambda^{T}\mu^{*} \ge R^{*},\quad \lambda \in \Omega. \tag{13}$$
Herein, $\mu^{*} = \zeta\mu$, $V^{*} = V + \zeta(1-\zeta)\mu\mu^{T}$, and $\zeta \in (0, 1]$ is closely related to $K$ in (12) and reflects the level of confidence in $\mu$ and $V$ estimated under the measure $P$. For example, $\zeta = 1$ means that investors believe the estimated $\mu$ and $V$ are the true parameters, and the investor's confidence weakens as $\zeta$ decreases. For the details of the proof, we refer the reader to Yam et al. [6].
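The passage from $(\mu, V)$ to $(\mu^{*}, V^{*})$ can be sketched in a few lines of Python. The sketch assumes a diagonal $V$ stored as a list of variances; the function names and the three-loan numbers are illustrative assumptions and come neither from the paper nor from Yam et al. [6].

```python
def robust_parameters(mu, variances, zeta):
    """Return (mu_star, v_star) with mu* = zeta * mu and
    V* = V + zeta * (1 - zeta) * mu mu^T, where V = diag(variances)."""
    if not 0.0 < zeta <= 1.0:
        raise ValueError("zeta must lie in (0, 1]")
    n = len(mu)
    mu_star = [zeta * m for m in mu]
    scale = zeta * (1.0 - zeta)
    v_star = [[scale * mu[i] * mu[j] + (variances[i] if i == j else 0.0)
               for j in range(n)]
              for i in range(n)]
    return mu_star, v_star


def portfolio_return(weights, mu):
    """lambda^T mu."""
    return sum(w * m for w, m in zip(weights, mu))


def portfolio_variance(weights, cov):
    """lambda^T V lambda for a full covariance matrix given as nested lists."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Hypothetical three-loan example.
mu = [0.06, 0.08, 0.10]
variances = [0.01, 0.02, 0.05]
weights = [0.5, 0.3, 0.2]

mu_nominal, v_nominal = robust_parameters(mu, variances, zeta=1.0)   # reduces to (mu, V)
mu_robust, v_robust = robust_parameters(mu, variances, zeta=0.75)

nominal_return = portfolio_return(weights, mu_nominal)
robust_return = portfolio_return(weights, mu_robust)
nominal_variance = portfolio_variance(weights, v_nominal)
robust_variance = portfolio_variance(weights, v_robust)
```

For $\zeta < 1$ the robust model discounts every expected return by the factor $\zeta$ and, because $\mu\mu^{T}$ is positive semidefinite, never reports a smaller portfolio variance than the nominal model.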
4.2 Robust Mean-Variance Portfolio Optimization Model in P2P Lending. In Section 3.2, we estimated each loan's expected return and variance of return, i.e., $\mu_{i}$ and $\sigma_{i}$, using the instance-based credit risk assessment model. Let $\mu = (\mu_{1}, \mu_{2}, \ldots, \mu_{n})^{T}$ and

$$V = \begin{bmatrix} \sigma_{1} & 0 & \cdots & 0 \\ 0 & \sigma_{2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{n} \end{bmatrix} \tag{14}$$

denote the expected return vector and the covariance matrix of the loans' returns under the probability measure $P$. Here, we assume that the correlation between P2P loans is negligible. Now we can rewrite (13) as
$$\min_{\lambda \in \mathbb{R}^{n}}\ \lambda^{T}\left(V + \zeta(1-\zeta)\mu\mu^{T}\right)\lambda \quad \text{s.t.}\quad \lambda^{T}(\zeta\mu) \ge R^{*},\quad \lambda \in \Omega. \tag{15}$$

Table 1: Description of variables.

Variable   Description
X1         FICO score of the borrower
X2         The number of inquiries of the borrower in the last 6 months
X3         The monetary amount of the loan
X4         The homeownership status of the borrower (0 = rent, 1 = own)
X5         The debt-to-income ratio of the borrower
X6         The number of accounts delinquent
X7         The number of public records in the past 10 years
Y          Dependent variable (0 = completed, 1 = default)
The feasible region $\Omega$ of our problem is defined by the following constraints:

(1) The value of the portfolio remains at its initial value, i.e., $\sum_{i} \lambda_{i} = 1$.
(2) Short-selling is forbidden; thus $\lambda_{i} \ge 0$.
(3) For each loan, the amount that a lender can invest is no more than the amount $m_{i}$ requested by the borrower; thereby $\lambda_{i} M \le m_{i}$, where $M$ is the total investment amount the investor has available.
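The three constraints defining $\Omega$ can be checked mechanically for any candidate weight vector. The following is a minimal sketch; the function name `is_feasible`, the tolerance, and the requested amounts are hypothetical.

```python
def is_feasible(weights, requested_amounts, total_investment, tol=1e-9):
    """Check constraints (1)-(3) that define the feasible region Omega."""
    budget_ok = abs(sum(weights) - 1.0) <= tol                  # (1) sum_i lambda_i = 1
    no_short_ok = all(w >= -tol for w in weights)               # (2) lambda_i >= 0
    cap_ok = all(w * total_investment <= m + tol                # (3) lambda_i * M <= m_i
                 for w, m in zip(weights, requested_amounts))
    return budget_ok and no_short_ok and cap_ok


# Hypothetical example: three loans and M = 15000 dollars to invest.
requested = [4000.0, 10000.0, 25000.0]   # m_i, amounts requested by the borrowers
M = 15000.0

ok = is_feasible([0.2, 0.3, 0.5], requested, M)          # invests 3000, 4500, 7500
over_cap = is_feasible([0.4, 0.3, 0.3], requested, M)    # 6000 exceeds m_1 = 4000
short_sale = is_feasible([-0.1, 0.6, 0.5], requested, M)
```

Constraint (3) is equivalent to the upper bound $\lambda_{i} \le m_{i}/M$, so together with (2) it turns into simple box bounds on the weights when the problem is passed to a quadratic programming solver.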
5 Empirical Analysis
In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using a real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.

5.1 Data Description and Preprocessing. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.

Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1-7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.
Figure 1: The curve of CV(h) (horizontal axis: bandwidth h, from 0 to 0.2; vertical axis: CV(h), from 0.095 to 0.1).
There exist many credit scoring models for predicting the default probability of a loan, such as the XGBoost model [33-35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.

We randomly divide the dataset into two parts: one containing 40% of all loans, used for determining the optimal bandwidth h in (5), which will be described in detail in Section 5.3, and a second part containing the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in (thus their pay-back statuses are unknown), and all other subsets are taken as the training set, which consists of historical loans with known yield.
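The partition described above can be sketched as follows. This is an illustrative reconstruction, not the authors' MATLAB code: the function names, the random seed, and the use of integer loan identifiers are assumptions.

```python
import random


def split_dataset(loan_ids, bandwidth_fraction=0.4, n_folds=20, seed=0):
    """Shuffle the loans, reserve a fraction for bandwidth selection,
    and cut the remainder into n_folds subsets of nearly equal size."""
    rng = random.Random(seed)
    shuffled = list(loan_ids)
    rng.shuffle(shuffled)
    n_bandwidth = int(round(bandwidth_fraction * len(shuffled)))
    bandwidth_part = shuffled[:n_bandwidth]
    remainder = shuffled[n_bandwidth:]
    # Deal the remaining loans round-robin so that fold sizes differ by at most one.
    folds = [remainder[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def train_test_rounds(folds):
    """Yield (training_ids, testing_ids) for every cross-validation round."""
    for k, test_fold in enumerate(folds):
        train = [loan for j, fold in enumerate(folds) if j != k for loan in fold]
        yield train, test_fold


# 17001 loans, identified here simply by their index.
bandwidth_part, folds = split_dataset(range(17001))
fold_sizes = sorted(len(fold) for fold in folds)
```

With 17001 loans this reserves 6800 loans for bandwidth selection and yields 20 folds of 510 or 511 loans each, in line with the "approximately 510" quoted in the text.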
5.2 Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.

IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. As the results of Guo et al. [5] show, this model outperforms some rating-based models.

RIOM is the robust instance-based model of this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.

We compare the two models by the following procedure:

(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_{i}$) and variance ($\sigma_{i}$) of each loan in the testing set. Thus, the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.
(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of investment in the optimal portfolio.
(3) Compare the return rates of the two models.
5.3 Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\, e^{-\zeta^{2}/2}$ as the kernel function. The important parameter of the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:

$$h_{optimal} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left(\mu_{h}(p_{-i}) - \mu_{i}\right)^{2}, \tag{16}$$

where $\mu_{h}(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_{i}$; specifically,

$$\mu_{h}(p_{-i}) = \sum_{j=1,\, j \ne i}^{n} \left[\frac{K\left((p_{i} - p_{j})/h\right)}{\sum_{j=1,\, j \ne i}^{n} K\left((p_{i} - p_{j})/h\right)} \cdot R_{j}\right]. \tag{17}$$
The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimum, and the h corresponding to the minimum is the optimal bandwidth for the model.
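The bandwidth search of (16) and (17) can be sketched in plain Python as below. Two points are assumptions of this sketch rather than statements of the paper: the realized return $R_{i}$ of the held-out loan is used as the target in place of $\mu_{i}$, and the search runs over a fixed candidate grid. The default probabilities and returns are made-up numbers.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel (1 / sqrt(2 pi)) * exp(-z^2 / 2)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, default_probs, returns, h):
    """Leave-one-out kernel estimate of loan i's return, as in (17).
    Returns None if every other loan receives zero kernel weight."""
    weights = [0.0 if j == i else
               gaussian_kernel((default_probs[i] - default_probs[j]) / h)
               for j in range(len(returns))]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * r for w, r in zip(weights, returns)) / total


def cv_score(default_probs, returns, h):
    """CV(h) of (16); the realized return R_i stands in for the target (assumption)."""
    score = 0.0
    for i in range(len(returns)):
        estimate = loo_estimate(i, default_probs, returns, h)
        if estimate is None:
            return math.inf
        score += (estimate - returns[i]) ** 2
    return score


def select_bandwidth(default_probs, returns, candidates):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(candidates, key=lambda h: cv_score(default_probs, returns, h))


# Hypothetical historical loans: default probability p_j and realized return R_j.
p = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30]
r = [0.07, 0.06, 0.065, 0.05, 0.04, 0.02, -0.01, -0.05]
candidates = [0.02 * k for k in range(1, 11)]   # 0.02, 0.04, ..., 0.20

best_h = select_bandwidth(p, r, candidates)
```

Plotting `cv_score` over the candidate grid would give a curve of the kind shown in Figure 1, although the values here are illustrative only.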
To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^{*} = 0.05$. We also set the risk-free return rate to 0.025, which is approximately equal to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.

Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
Figure 2: Performance comparison (horizontal axis: the number of the parameter set, 1 to 10; vertical axis: return rate of investment; series: IOM and RIOM).

Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset    IOM      RIOM
1         0.0501   0.0566
2         0.0550   0.0633
3         0.0540   0.0618
4         0.0564   0.0696
5         0.0627   0.0714
6         0.0543   0.0629
7         0.0532   0.0688
8         0.0605   0.0711
9         0.0593   0.0706
10        0.0546   0.0664
11        0.0637   0.0701
12        0.0567   0.0640
13        0.0468   0.0569
14        0.0519   0.0663
15        0.0544   0.0620
16        0.0357   0.0472
17        0.0588   0.0710
18        0.0607   0.0774
19        0.0544   0.0655
20        0.0625   0.0808
Average   0.0553   0.0662
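The summary statistics of Table 2 can be re-derived from its rows. The short Python check below uses only the values printed in the table; the variable names are arbitrary.

```python
# Return rates per test subset, copied from Table 2.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]

required_return = 0.05   # R* used for Table 2

iom_average = sum(iom) / len(iom)
riom_average = sum(riom) / len(riom)

# Number of subsets in which RIOM earns more than IOM.
riom_wins = sum(r > i for i, r in zip(iom, riom))

# Subsets (1-based) in which both models fall short of the required return.
both_below = [k + 1 for k, (i, r) in enumerate(zip(iom, riom))
              if i < required_return and r < required_return]
```

The averages agree with the last row of the table to four decimal places, RIOM earns more than IOM in all 20 subsets, and subset 16 is the only one in which both realized returns fall below $R^{*} = 0.05$.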
In order to verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^{*}$ and investment amount $M$.
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. More intuitive results are shown in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We find that the RIOM model outperforms the IOM model across all parameter sets.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10000                5.0%
2     $10000                5.5%
3     $10000                6.0%
4     $15000                5.0%
5     $15000                5.5%
6     $15000                6.0%
7     $20000                5.0%
8     $20000                5.5%
9     $20000                6.0%
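The nine parameter sets of Table 3 form a full grid of three investment amounts and three required rates, which can be enumerated as in the following sketch. The dictionary layout and key names are hypothetical choices made for illustration.

```python
from itertools import product

investment_amounts = [10000, 15000, 20000]   # M, in dollars
required_rates = [0.050, 0.055, 0.060]       # R*

# Set numbers follow Table 3: M varies slowest, R* varies fastest.
parameter_sets = {
    set_id: {"M": amount, "R_star": rate}
    for set_id, (amount, rate) in enumerate(
        product(investment_amounts, required_rates), start=1)
}
```

Set 4 of this grid, with $M$ = 15000 dollars and $R^{*}$ = 5%, is the configuration used for Table 2.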
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.
6 Conclusions
In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantitative risk estimate instead of just labelling each loan with a credit grade. Secondly, the model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, the model considers the ambiguity of the loans' distribution (probability measure uncertainty) and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments indicate that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.
Table 4: Investment performances of input parameters for portfolio selection.

         M = 10000                                       M = 15000                                       M = 20000
         R* = 5%         R* = 5.5%       R* = 6%         R* = 5%         R* = 5.5%       R* = 6%         R* = 5%         R* = 5.5%       R* = 6%
Subset   IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM    IOM     RIOM
1        0.0598  0.0727  0.0601  0.0762  0.0502  0.0722  0.0501  0.0566  0.0558  0.0839  0.0520  0.0833  0.0594  0.0774  0.0544  0.0689  0.0691  0.0649
2        0.0500  0.0592  0.0601  0.0779  0.0675  0.0851  0.0550  0.0633  0.0517  0.0620  0.0551  0.0648  0.0504  0.0584  0.0664  0.0895  0.0661  0.0769
3        0.0441  0.0571  0.0491  0.0698  0.0735  0.0934  0.0540  0.0618  0.0598  0.0778  0.0631  0.0646  0.0503  0.0568  0.0554  0.0737  0.0647  0.0869
4        0.0525  0.0602  0.0658  0.0903  0.0636  0.0754  0.0564  0.0696  0.0512  0.0629  0.0553  0.0847  0.0566  0.0648  0.0617  0.0856  0.0518  0.0635
5        0.0532  0.0620  0.0631  0.0829  0.0513  0.0731  0.0627  0.0714  0.0566  0.0721  0.0616  0.0896  0.0576  0.0709  0.0547  0.0692  0.0610  0.0771
6        0.0634  0.0747  0.0564  0.0762  0.0717  0.1105  0.0543  0.0629  0.0570  0.0772  0.0585  0.0874  0.0584  0.0683  0.0528  0.0701  0.0516  0.0385
7        0.0613  0.0736  0.0547  0.0754  0.0551  0.0884  0.0532  0.0688  0.0528  0.0727  0.0620  0.0566  0.0481  0.0549  0.0485  0.0631  0.0460  0.0759
8        0.0529  0.0594  0.0505  0.0616  0.0685  0.0858  0.0605  0.0711  0.0545  0.0768  0.0645  0.0855  0.0545  0.0701  0.0628  0.0823  0.0592  0.0845
9        0.0548  0.0647  0.0550  0.0736  0.0559  0.0493  0.0593  0.0706  0.0507  0.0576  0.0574  0.1214  0.0535  0.0586  0.0561  0.0764  0.0574  0.1038
10       0.0474  0.0574  0.0472  0.0634  0.0499  0.0794  0.0546  0.0664  0.0528  0.0634  0.0622  0.0631  0.0514  0.0597  0.0582  0.0683  0.0689  0.0532
11       0.0597  0.0730  0.0602  0.0795  0.0661  0.1090  0.0637  0.0701  0.0562  0.0756  0.0498  0.0662  0.0531  0.0584  0.0569  0.0657  0.0572  0.1141
12       0.0644  0.0768  0.0541  0.0673  0.0624  0.1042  0.0567  0.0640  0.0529  0.0677  0.0574  0.0983  0.0551  0.0678  0.0536  0.0734  0.0618  0.0687
13       0.0635  0.0785  0.0709  0.0895  0.0532  0.0662  0.0468  0.0569  0.0637  0.0880  0.0504  0.0913  0.0555  0.0695  0.0636  0.0824  0.0616  0.1157
14       0.0593  0.0744  0.0626  0.0751  0.0634  0.1204  0.0519  0.0663  0.0568  0.0716  0.0614  0.1162  0.0577  0.0674  0.0541  0.0636  0.0572  0.0818
15       0.0523  0.0636  0.0485  0.0609  0.0571  0.0987  0.0544  0.0620  0.0577  0.0764  0.0633  0.0802  0.0597  0.0775  0.0536  0.0706  0.0595  0.0704
16       0.0549  0.0705  0.0684  0.0893  0.0508  0.1264  0.0357  0.0472  0.0642  0.0851  0.0573  0.0549  0.0593  0.0698  0.0616  0.0807  0.0551  0.0748
17       0.0549  0.0666  0.0549  0.0757  0.0538  0.0677  0.0588  0.0710  0.0674  0.0867  0.0615  0.0496  0.0535  0.0641  0.0487  0.0636  0.0696  0.0915
18       0.0546  0.0629  0.0512  0.0615  0.0560  0.0610  0.0607  0.0774  0.0585  0.0732  0.0687  0.0824  0.0599  0.0729  0.0576  0.0746  0.0507  0.1069
19       0.0492  0.0555  0.0572  0.0685  0.0657  0.0436  0.0544  0.0655  0.0434  0.0633  0.0589  0.0759  0.0581  0.0673  0.0472  0.0638  0.0623  0.1148
20       0.0554  0.0645  0.0413  0.0504  0.0596  0.0366  0.0625  0.0808  0.0562  0.0687  0.0698  0.0978  0.0518  0.0709  0.0601  0.0734  0.0638  0.0744
Average  0.0554  0.0664  0.0566  0.0732  0.0598  0.0823  0.0553  0.0662  0.0560  0.0731  0.0595  0.0807  0.0552  0.0663  0.0564  0.0729  0.0597  0.0819
Data Availability
The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grant nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
References
[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77-91, 1952.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.
[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.
[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443-1471, 2002.
[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P Lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417-426, 2016.
[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554-561, 2016.
[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54-70, 2014.
[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389-405, 2011.
[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589-609, 1968.
[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150-154, 1970.
[11] J. C. Wiginton, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757-770, 1980.
[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.
[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123-130, 2011.
[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537-1549, 2007.
[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131-1152, 2000.
[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039-1053, 2006.
[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847-856, 2007.
[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194-3204, 2015.
[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19-35, 2018.
[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129-1141, 2009.
[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847-865, 2011.
[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085-2099, 2010.
[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695-731, 2010.
[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875-891, 2014.
[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60-66, 2001.
[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853-877, 2007.
[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235-292, 2018.
[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1-17, 2018.
[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.
[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115-129, 2004.
[31] H. Yao, Z. Li, and Y. Lai, "Mean-CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014-1022, 2013.
[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186-190, 1965.
[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69-80, 2015.
[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225-241, 2017.
[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105-117, 2018.
[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166-172, 2012.
[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605-12617, 2012.
[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998-3004, 2015.
4 Mathematical Problems in Engineering
vector R2-valued With the sample set (xj yj)| j = 1 2119899 the kernel estimator 119910 of the target y given its predictiveobservation x is defined as
119910 = 119899sum119895=1
[[
119870((119909 minus 119909119895) ℎ)sum119899119895=1119870((119909 minus 119909119895) ℎ) sdot 119910119895
]] (4)
where K(sdot) is a kernel function and h is the bandwidthFor the instance-based credit risk modeling the set of
historical observations is represented as (pj Rj)| j = 1 2119899 where pj and Rj are the default probability and return rateof the jth loan respectively Thereby the estimation of the ithloanrsquos return could be written as
120583119894 =119899sum119895=1
[[
119870((119901119894 minus 119901119895) ℎ)sum119899119895=1119870((119901119894 minus 119901119895) ℎ) sdot 119877119895
]] (5)
Note that the determination of loansrsquo default probability willbe introduced in Section 51
Comparing (1) to (5) we can represent the optimal weight119908119894119895 as
119908119894119895 = 119870((119901119894 minus 119901119895) ℎ)sum119899119895=1119870((119901119894 minus 119901119895) ℎ) (6)
Using the optimal weight 119908119894119895 and the expected return 120583119894derived from (5) (2) can be rewritten as
2119894 =119899sum119895=1
[[
119870((119901119894 minus 119901119895) ℎ)sum119899119895=1119870((119901119894 minus 119901119895) ℎ) sdot (119877119895 minus 120583119894)
2]] (7)
4 Robust Investment Decision Model
Similar to bond investment P2P lenders can invest a portionof each loan Thus P2P loan investment decisions can betransformed into a credit portfolio optimization problemThis section introduces the portfolio optimization model forinvestment decisions in P2P lending which accounts for theuncertainty of the distribution of the loans We start fromthe classical mean-variance optimization model proposed byMarkowitz [1] to its tractable robust counterpart
41 Robust Optimization Model Based on Relative EntropyConstraints In the classical mean-variance optimizationmodel the optimal asset allocation strategy is identified bysolving the tradeoff between risk and return according toinvestorsrsquo risk preference A portfolio that invests in n assets isrepresented as a vector of weights 120582 isin Rn where each weightdenotes the proportion of wealth allocated to an asset Thenthe return and risk of the portfolio become 120582T120583 and 120582T119881120582respectively where 120583 isin Rn and V isin Rntimesn are the expectedreturn and the covariance matrix of the assetsrsquo returnsunder the probability measure (or probability distribution)P respectively Here P represents the ideal estimated marketcondition where 120583 and V estimated by using all availableinformation including historical observations news expert
knowledge and so on are assumed as the actual expectedreturn and covariance matrix Thus the classical mean-variance portfolio selection problem (MV) can be formulatedas
(MV) min120582
120582T119881120582st 120582T120583 ge 119877lowast
120582 isin Ω(8)
whereΩ sube Rn denotes the set of feasible portfolios and 119877lowast isthe required return rate specified by the investor
In reality the assumption that the expected return 120583and covariance matrix V are known with certainty is lessreasonable It is quite possible that the estimated parametersare different with the actual ones Thus the optimal portfolioidentified by using the estimated inputs parameters 120583 andV directly may be inappropriate Robust optimization seeksfor portfolios that are insensitive to the uncertain in theparameters and the solutions that must be feasible no matterwhat the actual value of the parameters is
The investors might consider a set of probability mea-sures ie an uncertainty set to cover a range of scenariosbased on their assessments and then use robust optimizationto obtain approximate optimal strategies for the worst sce-narios within the uncertainty set In this paper we define Qas the set of probability measures representing the possiblescenarios 120583119876 and 119881119876 as the expected return and covariancematrix estimated under the probability measure 119876 isin QMathematically the robust counterpart of the classical mean-variance optimization problem (RMV) can be written as
(RMV) min120582
sup119876isinQ
120582T119881119876120582st inf
119876isinQ120582T120583119876 ge 119877lowast
120582 isin Ω(9)
It is rational to assume that the actual value of the parametersis in the neighborhood of the estimatorThus we can generatethe uncertainty set Q based on the assumption that themeasures in the set should be not far from the ideal measureP Relative entropy also known as the KullbackndashLeiblerdivergence can be used to measure the difference betweenprobability measures The relative entropy of the measure 119876in Q with respect to the measure P is
119863119870119871 (119876 119875) fl int119902 (119909) ln 119902 (119909)119901 (119909)119889119909 (10)
where 119901(119909) and 119902(119909) are the probability density functions(pdf) of the loansrsquo returns under probability measures P and119876 respectively In the context of mean-variance analysisrelative entropy 119863119870119871(119876 119875) can be rewritten as
119863119870119871 (119876 119875) = 12 [ln |119881| minus ln 10038161003816100381610038161198811198761003816100381610038161003816 + tr (119881minus1119881119876) minus 119899+ (120583 minus 120583119876)T119881minus1 (120583 minus 120583119876)]
(11)
Mathematical Problems in Engineering 5
where 120583 V 120583119876 and 119881119876 carry the same meaning as in (8) and(9) tr(V) |119881| and V be the trace the determinant and thetranspose of V respectively n is the amount of assets in theportfolio
Let U denote the set of parameters (120583119876 119881119876) under themeasure Q in Q Using the constraint of relative entropy wecan rewrite the robust optimization model (9) as
(RMV-RE) min120582
max(120583119876119881119876)isinU
120582T119881119876120582st min
(120583119876119881119876)isinU120582T120583119876 ge 119877lowast
119863119870119871 (119876 119875) le 119870120582 isin Ω
(12)
where K is a positive constant and determines the size ofuncertainty set Parameter K measures the level of uncer-tainty and reflects the investorsrsquo confidence in 120583 and Vestimated under probability measure P ie the greater Krsquosvalue the less confidence
Yam et al [6] prove that the robustmean-variance portfo-lio selection model based on relative entropy method (RMV-RE) can be formulated as quadratic optimization problemwhich is a tractable formulation and can be efficiently solvedThat is
min120582isinR119899
120582T119881 lowast 120582st 120582T120583lowast ge 119877lowast
120582 isin Ω(13)
Herein 120583lowast=120577120583 Vlowast=V+120577(1-120577)120583120583T and 120577 isin (0 1] is relatedto K in (12) closely which reflects the level of confidencein 120583 and V estimated under measure P For example 120577=1means that investors believe the estimated 120583 and V are thetrue parameters And as 120577 decreases the investorrsquos confidenceis weaker The details of the proof are referred to by Yam et al[6]
42 Robust Mean-Variance Portfolio Optimization Model inP2P Lending In the Section 32 we estimated each loanrsquosexpected return and variance of return ie 120583119894 and 120590119894 usingthe instance-based credit risk assessment model Let 120583 =(1205831 1205832 120583119899)T and
=[[[[[[[[[
1 0 00 2 d
d d 00 0 120590119899
]]]]]]]]]
(14)
denote the expected return vector and the covariance matrixof the loansrsquo returns under the probability measure P Herewe assume that the correlation between P2P loans is negligi-ble Now we can rewrite (13) as
$$
\begin{aligned}
\min_{\lambda \in \mathbb{R}^n} \quad & \lambda^T \left( V + \zeta (1 - \zeta) \mu \mu^T \right) \lambda \\
\text{s.t.} \quad & \lambda^T (\zeta \mu) \ge R^*, \\
& \lambda \in \Omega.
\end{aligned}
\qquad (15)
$$

Table 1: Description of variables.

Variable  Description
X1        FICO score of the borrower
X2        The number of inquiries of the borrower in the last 6 months
X3        The monetary amount of the loan
X4        The homeownership status of the borrower (0 = rent, 1 = own)
X5        The debt-to-income ratio of the borrower
X6        The number of accounts delinquent
X7        The number of public records in the past 10 years
Y         Dependent variable (0 = completed, 1 = default)
The feasible region $\Omega$ of our problem is defined by the following constraints:
(1) The value of the portfolio remains at its initial value, i.e., $\sum_i \lambda_i = 1$.
(2) Short-selling is forbidden; thus $\lambda_i \ge 0$.
(3) For each loan, the amount that a lender can invest is no more than the amount $m_i$ requested by the borrower; thereby $\lambda_i M \le m_i$, where $M$ is the total investment amount that the investor has available.
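The three constraints defining $\Omega$, together with the return constraint and the objective of (15), can be checked for a candidate allocation with a short Python sketch. All function names and the example numbers are hypothetical. The objective relies on the diagonal form of $V$ in (14), under which $\lambda^T (V + \zeta(1-\zeta)\mu\mu^T) \lambda = \sum_i \sigma_i \lambda_i^2 + \zeta(1-\zeta)(\mu^T\lambda)^2$.

```python
def in_feasible_region(lam, requested, M, tol=1e-9):
    """Check the three constraints defining Omega for weights lam."""
    budget_ok = abs(sum(lam) - 1.0) <= tol              # (1) weights sum to 1
    no_short = all(l >= -tol for l in lam)              # (2) no short-selling
    caps_ok = all(l * M <= m + tol                      # (3) lam_i * M <= m_i
                  for l, m in zip(lam, requested))
    return budget_ok and no_short and caps_ok


def meets_required_return(lam, mu, zeta, R_star):
    """Return constraint of (15): lam^T (zeta * mu) >= R_star."""
    return zeta * sum(m * l for m, l in zip(mu, lam)) >= R_star


def robust_objective(lam, mu, sigma, zeta):
    """Objective of (15) when V = diag(sigma), as assumed in (14)."""
    own_risk = sum(s * l * l for s, l in zip(sigma, lam))
    port_mean = sum(m * l for m, l in zip(mu, lam))
    return own_risk + zeta * (1.0 - zeta) * port_mean ** 2


# Illustrative inputs only (hypothetical three-loan example).
mu = [0.07, 0.09, 0.08]            # expected returns
sigma = [0.01, 0.02, 0.015]        # variances of return
requested = [10000, 5000, 4000]    # amounts requested by the borrowers
lam = [0.5, 0.3, 0.2]
feasible = in_feasible_region(lam, requested, M=15000)
```

Because the robust model scales the expected return by $\zeta$, a portfolio must have a nominal expected return of at least $R^*/\zeta$ to satisfy the return constraint.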
5 Empirical Analysis
In this section, we investigate the validity of the robust mean-variance portfolio optimization model in P2P lending using a real-world dataset from a notable P2P lending platform, Prosper. All numerical experiments are performed using MATLAB on a PC.
5.1. Data Description and Preprocess. The dataset for the empirical study is from Prosper, a notable P2P lending platform in the United States. It consists of 17001 loans, including 3039 default loans and 13908 completed loans, whose issue dates fall within the period from November 2005 to March 2014.
Using the data, a credit scoring model is learnt to transform the loan attributes into the default probability. The loan attributes are as follows: the borrower's FICO score, which reflects the borrower's creditworthiness; the borrower's number of inquiries in the past six months; the monetary amount of the loan; the homeownership status of the borrower; the debt-to-income ratio of the borrower; the borrower's current delinquencies, representing the number of accounts delinquent; and the borrower's number of public records in the past 10 years (Rows 1-7 in Table 1). The target variable is a binary variable (0 represents completed and 1 represents default), as described in Row 8 of Table 1.
Figure 1: The curve of CV(h) (horizontal axis: h; vertical axis: CV(h)).
There exist many credit scoring models to predict the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, discussing how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
We randomly divide the dataset into two parts: one containing 40% of all loans, used to determine the optimal bandwidth $h$ in (5), which is described in detail in Section 5.3, and a second part containing the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in, so their pay-back statuses are treated as unknown, and all other subsets are taken as the training set, which consists of historical loans with known yield.
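The split described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names, the fixed random seed, and the way the remainder is dealt into folds are all hypothetical choices.

```python
import random


def split_loans(loan_ids, bandwidth_share=0.4, n_folds=20, seed=0):
    """Shuffle the loans, set aside a share for bandwidth selection,
    and deal the remaining loans into n_folds subsets."""
    rng = random.Random(seed)          # fixed seed, assumed for repeatability
    ids = list(loan_ids)
    rng.shuffle(ids)
    cut = int(round(bandwidth_share * len(ids)))
    bandwidth_part, rest = ids[:cut], ids[cut:]
    folds = [rest[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def rounds(folds):
    """Yield (training, testing) pairs, using each subset once for testing."""
    for k, testing in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield training, testing


bandwidth_part, folds = split_loans(range(17001))
```

With 17001 loans, about 6800 go to bandwidth selection and the remaining 10201 are split into 20 subsets of 510 or 511 loans, consistent with the "approximately 510 loans" stated in the text.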
5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. In order to show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the models in detail.
IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. Then the classical mean-variance model (8) is used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.
RIOM is the robust instance-based model proposed in this study. The expected return and risk of each loan are also assessed based on the "instance-based" assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.
We compare the two models by the following procedure:
(1) Train the credit risk assessment model with the training set and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.
(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm and compute the performance of the investment in the optimal portfolio.
(3) Compare the return rates of the two models.
5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\, e^{-\zeta^2/2}$ as the kernel function. The important parameter in the kernel regression model, the bandwidth $h$, is optimized by the following leave-one-out cross-validation:
$$
h_{optimal} = \arg\min_h CV(h) = \arg\min_h \sum_{i=1}^{n} \left( \mu_h(p_{-i}) - \mu_i \right)^2 \qquad (16)
$$
where $\mu_h(p_{-i})$ is the leave-one-out estimation of the expected return rate $\mu_i$; specifically,
$$
\mu_h(p_{-i}) = \sum_{j=1, j \ne i}^{n} \left[ \frac{K((p_i - p_j)/h)}{\sum_{j=1, j \ne i}^{n} K((p_i - p_j)/h)} \cdot R_j \right] \qquad (17)
$$
The curve of CV(h) is exhibited in Figure 1. The shape of the curve clearly shows a minimal point, and the $h$ corresponding to the minimal point is the optimal bandwidth for the model.
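The leave-one-out procedure in (16) and (17) can be sketched in Python as follows. Several points are assumptions made for illustration: `p` is taken to hold one scalar score per loan (the $p_i$ in (17)), `R` the realized returns $R_j$, and `targets` the values standing in for $\mu_i$ in (16), for which realized returns would be a natural choice. The fallback to a plain average when all kernel weights underflow to zero is also an assumption, not part of the paper.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, R, h):
    """Eq. (17): kernel-weighted average of R_j over all j != i."""
    others = [j for j in range(len(p)) if j != i]
    weights = [gaussian_kernel((p[i] - p[j]) / h) for j in others]
    total = sum(weights)
    if total == 0.0:
        # Assumed fallback: every weight underflowed for a very small h.
        return sum(R[j] for j in others) / len(others)
    return sum(w * R[j] for w, j in zip(weights, others)) / total


def cv_score(p, R, targets, h):
    """Eq. (16): sum of squared leave-one-out errors for bandwidth h."""
    return sum((loo_estimate(i, p, R, h) - targets[i]) ** 2
               for i in range(len(p)))


def select_bandwidth(p, R, targets, grid):
    """Return the bandwidth in grid with the smallest CV score."""
    return min(grid, key=lambda h: cv_score(p, R, targets, h))
```

A grid search is used here because CV(h) is evaluated only at candidate values; a grid from 0.02 to 0.2 would cover the range plotted in Figure 1.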
To apply the robust credit portfolio optimization method to obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount $M$ = 15 thousand dollars, and the required rate of return $R^* = 0.05$. We also set the risk-free return rate to 0.025, which is approximately equivalent to the average yield of T-Bills over the same period. We use the MATLAB built-in solver "quadprog" to solve the two portfolio optimization problems.
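One way problem (15) could be arranged for a quadratic programming solver is sketched below in Python. MATLAB's quadprog minimizes $\frac{1}{2} x^T H x + f^T x$ subject to $Ax \le b$, $A_{eq} x = b_{eq}$, and $lb \le x \le ub$. The mapping onto those arguments is an illustration, not the authors' code, and the function name `quadprog_inputs` is hypothetical.

```python
def quadprog_inputs(mu, sigma, requested, M, zeta, R_star):
    """Assemble (H, f, A, b, Aeq, beq, lb, ub) for problem (15),
    assuming V = diag(sigma) as in (14)."""
    n = len(mu)
    c = zeta * (1.0 - zeta)
    # The solver minimizes 0.5 * x'Hx, so H is twice the matrix in (15).
    H = [[2.0 * ((sigma[i] if i == j else 0.0) + c * mu[i] * mu[j])
          for j in range(n)] for i in range(n)]
    f = [0.0] * n                          # no linear term in the objective
    A = [[-zeta * m for m in mu]]          # -(zeta * mu)' x <= -R_star
    b = [-R_star]
    Aeq = [[1.0] * n]                      # weights sum to 1
    beq = [1.0]
    lb = [0.0] * n                         # no short-selling
    ub = [m_i / M for m_i in requested]    # lam_i * M <= m_i
    return H, f, A, b, Aeq, beq, lb, ub


# Settings quoted in the text; the loan-level inputs are hypothetical.
H, f, A, b, Aeq, beq, lb, ub = quadprog_inputs(
    mu=[0.07, 0.09], sigma=[0.01, 0.02], requested=[10000, 12000],
    M=15000, zeta=0.75, R_star=0.05)
```

Setting `zeta = 1` removes the rank-one term and the shrinkage of the expected returns, so the same arrays describe a mean-variance problem with the estimated $\mu$ and $V$ taken at face value.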
Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
Figure 2: Performance comparison (horizontal axis: the number of the parameter set; vertical axis: return rate of investment; series: IOM and RIOM).
Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset   IOM     RIOM
1        0.0501  0.0566
2        0.0550  0.0633
3        0.0540  0.0618
4        0.0564  0.0696
5        0.0627  0.0714
6        0.0543  0.0629
7        0.0532  0.0688
8        0.0605  0.0711
9        0.0593  0.0706
10       0.0546  0.0664
11       0.0637  0.0701
12       0.0567  0.0640
13       0.0468  0.0569
14       0.0519  0.0663
15       0.0544  0.0620
16       0.0357  0.0472
17       0.0588  0.0710
18       0.0607  0.0774
19       0.0544  0.0655
20       0.0625  0.0808
Average  0.0553  0.0662
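The summary statements about Table 2 can be checked with a few lines of Python. The return rates are copied from Table 2; the variable names and the comparison against the required rate $R^* = 0.05$ are illustrative choices.

```python
# Return rates per test subset, copied from Table 2 (subsets 1 to 20).
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605,
       0.0593, 0.0546, 0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357,
       0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711,
        0.0706, 0.0664, 0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472,
        0.0710, 0.0774, 0.0655, 0.0808]

R_star = 0.05  # required rate of return used in this experiment

avg_iom = round(sum(iom) / len(iom), 4)
avg_riom = round(sum(riom) / len(riom), 4)

# Number of subsets in which RIOM earns more than IOM.
riom_wins = sum(r > i for i, r in zip(iom, riom))

# Subsets (1-based) whose realized RIOM return falls below R_star.
riom_below_target = [k + 1 for k, r in enumerate(riom) if r < R_star]
```

The computed averages agree with the last row of Table 2 (0.0553 and 0.0662), RIOM earns more than IOM in all 20 subsets, and subset 16 is the only one in which the RIOM return falls below the required rate.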
In order to test and verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^*$ and investment amount $M$.
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. More intuitive results are shown in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We can see that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

Set  Investment amount M  Required rate R*
1    $10000               5.0%
2    $10000               5.5%
3    $10000               6.0%
4    $15000               5.0%
5    $15000               5.5%
6    $15000               6.0%
7    $20000               5.0%
8    $20000               5.5%
9    $20000               6.0%
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.
6 Conclusions
In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, this model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.
Table 4: Investment performances of input parameters for portfolio selection.

M = $10000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0598  0.0727   0.0601  0.0762   0.0502  0.0722
2        0.0500  0.0592   0.0601  0.0779   0.0675  0.0851
3        0.0441  0.0571   0.0491  0.0698   0.0735  0.0934
4        0.0525  0.0602   0.0658  0.0903   0.0636  0.0754
5        0.0532  0.0620   0.0631  0.0829   0.0513  0.0731
6        0.0634  0.0747   0.0564  0.0762   0.0717  0.1105
7        0.0613  0.0736   0.0547  0.0754   0.0551  0.0884
8        0.0529  0.0594   0.0505  0.0616   0.0685  0.0858
9        0.0548  0.0647   0.0550  0.0736   0.0559  0.0493
10       0.0474  0.0574   0.0472  0.0634   0.0499  0.0794
11       0.0597  0.0730   0.0602  0.0795   0.0661  0.1090
12       0.0644  0.0768   0.0541  0.0673   0.0624  0.1042
13       0.0635  0.0785   0.0709  0.0895   0.0532  0.0662
14       0.0593  0.0744   0.0626  0.0751   0.0634  0.1204
15       0.0523  0.0636   0.0485  0.0609   0.0571  0.0987
16       0.0549  0.0705   0.0684  0.0893   0.0508  0.1264
17       0.0549  0.0666   0.0549  0.0757   0.0538  0.0677
18       0.0546  0.0629   0.0512  0.0615   0.0560  0.0610
19       0.0492  0.0555   0.0572  0.0685   0.0657  0.0436
20       0.0554  0.0645   0.0413  0.0504   0.0596  0.0366
Average  0.0554  0.0664   0.0566  0.0732   0.0598  0.0823

M = $15000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0501  0.0566   0.0558  0.0839   0.0520  0.0833
2        0.0550  0.0633   0.0517  0.0620   0.0551  0.0648
3        0.0540  0.0618   0.0598  0.0778   0.0631  0.0646
4        0.0564  0.0696   0.0512  0.0629   0.0553  0.0847
5        0.0627  0.0714   0.0566  0.0721   0.0616  0.0896
6        0.0543  0.0629   0.0570  0.0772   0.0585  0.0874
7        0.0532  0.0688   0.0528  0.0727   0.0620  0.0566
8        0.0605  0.0711   0.0545  0.0768   0.0645  0.0855
9        0.0593  0.0706   0.0507  0.0576   0.0574  0.1214
10       0.0546  0.0664   0.0528  0.0634   0.0622  0.0631
11       0.0637  0.0701   0.0562  0.0756   0.0498  0.0662
12       0.0567  0.0640   0.0529  0.0677   0.0574  0.0983
13       0.0468  0.0569   0.0637  0.0880   0.0504  0.0913
14       0.0519  0.0663   0.0568  0.0716   0.0614  0.1162
15       0.0544  0.0620   0.0577  0.0764   0.0633  0.0802
16       0.0357  0.0472   0.0642  0.0851   0.0573  0.0549
17       0.0588  0.0710   0.0674  0.0867   0.0615  0.0496
18       0.0607  0.0774   0.0585  0.0732   0.0687  0.0824
19       0.0544  0.0655   0.0434  0.0633   0.0589  0.0759
20       0.0625  0.0808   0.0562  0.0687   0.0698  0.0978
Average  0.0553  0.0662   0.0560  0.0731   0.0595  0.0807

M = $20000
         R* = 5%          R* = 5.5%        R* = 6%
Subset   IOM     RIOM     IOM     RIOM     IOM     RIOM
1        0.0594  0.0774   0.0544  0.0689   0.0691  0.0649
2        0.0504  0.0584   0.0664  0.0895   0.0661  0.0769
3        0.0503  0.0568   0.0554  0.0737   0.0647  0.0869
4        0.0566  0.0648   0.0617  0.0856   0.0518  0.0635
5        0.0576  0.0709   0.0547  0.0692   0.0610  0.0771
6        0.0584  0.0683   0.0528  0.0701   0.0516  0.0385
7        0.0481  0.0549   0.0485  0.0631   0.0460  0.0759
8        0.0545  0.0701   0.0628  0.0823   0.0592  0.0845
9        0.0535  0.0586   0.0561  0.0764   0.0574  0.1038
10       0.0514  0.0597   0.0582  0.0683   0.0689  0.0532
11       0.0531  0.0584   0.0569  0.0657   0.0572  0.1141
12       0.0551  0.0678   0.0536  0.0734   0.0618  0.0687
13       0.0555  0.0695   0.0636  0.0824   0.0616  0.1157
14       0.0577  0.0674   0.0541  0.0636   0.0572  0.0818
15       0.0597  0.0775   0.0536  0.0706   0.0595  0.0704
16       0.0593  0.0698   0.0616  0.0807   0.0551  0.0748
17       0.0535  0.0641   0.0487  0.0636   0.0696  0.0915
18       0.0599  0.0729   0.0576  0.0746   0.0507  0.1069
19       0.0581  0.0673   0.0472  0.0638   0.0623  0.1148
20       0.0518  0.0709   0.0601  0.0734   0.0638  0.0744
Average  0.0552  0.0663   0.0564  0.0729   0.0597  0.0819
Data Availability
The data used in this paper were downloaded from the website of Prosper: https://www.prosper.com/invest/download.aspx.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grants nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
References
[1] H. Markowitz, "Portfolio selection," The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.
[3] N. Larsen, H. Mausser, and S. Uryasev, "Algorithms for optimization of Value-at-Risk," in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.
[4] R. T. Rockafellar and S. Uryasev, "Conditional value-at-risk for general loss distributions," Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.
[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, "Instance-based credit risk assessment for investment decisions in P2P Lending," European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.
[6] S. C. P. Yam, H. Yang, and F. L. Yuen, "Optimal asset allocation: Risk and information uncertainty," European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.
[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.
[8] E. Berkovich, "Search and herding effects in peer-to-peer lending: evidence from prosper.com," Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.
[9] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.
[10] S. Chatterjee and S. Barcun, "A nonparametric approach to credit screening," Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.
[11] J. C. Wigintor, "A note on the comparison of logit and discriminant models of consumer credit behavior," Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.
[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.
[13] M. M. So and L. C. Thomas, "Modelling the profitability of credit cards by Markov decision processes," European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.
[14] G. Andreeva, J. Ansell, and J. Crook, "Modelling profitability using survival combination scores," European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.
[15] D. West, "Neural network credit scoring models," Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.
[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, "Two-stage genetic programming (2SGP) for the credit scoring model," Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.
[17] C. L. Huang, M. C. Chen, and C. J. Wang, "Credit scoring with a data mining approach based on support vector machines," Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.
[18] P. Danenas and G. Garsva, "Selection of support vector machines based classifiers for credit risk domain," Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.
[19] G. Sermpinis, S. Tsoukas, and P. Zhang, "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.
[20] K. Natarajan, D. Pachamanova, and M. Sim, "Constructing risk measures from uncertainty sets," Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.
[21] L. Chen, S. He, and S. Zhang, "Tight bounds for some risk measures, with applications to robust portfolio selection," Operations Research, vol. 59, no. 4, pp. 847–865, 2011.
[22] L. G. Epstein, "A paradox for the 'smooth ambiguity' model of preference," Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.
[23] K. Natarajan, M. Sim, and J. Uichanco, "Tractable robust expected utility and risk models for portfolio optimization," Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.
[24] A. B. Pac and M. C. Pınar, "Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity," TOP, vol. 22, no. 3, pp. 875–891, 2014.
[25] L. P. Hansen and T. J. Sargent, "Robust control and model uncertainty," The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.
[26] G. C. Calafiore, "Ambiguous risk measures and optimal robust portfolios," Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.
[27] D. Bertsimas, V. Gupta, and N. Kallus, "Data-driven robust optimization," Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.
[28] Z. Kang, X. Li, Z. Li, and S. Zhu, "Data-driven robust mean-CVaR portfolio selection under distribution ambiguity," Quantitative Finance, pp. 1–17, 2018.
[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.
[30] O. Scaillet, "Nonparametric estimation and sensitivity analysis of expected shortfall," Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.
[31] H. Yao, Z. Li, and Y. Lai, "Mean–CVaR portfolio selection: A nonparametric estimation framework," Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.
[32] E. A. Nadaraja, "On non-parametric estimates of density functions and regression," Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.
[33] T. Chen and T. He, "Higgs boson discovery with boosted trees," in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.
[34] Y. Xia, C. Liu, Y. Li, and N. Liu, "A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring," Expert Systems with Applications, vol. 78, pp. 225–241, 2017.
[35] H. He, W. Zhang, and S. Zhang, "A novel ensemble method for credit scoring: Adaption of different imbalance ratios," Expert Systems with Applications, vol. 98, pp. 105–117, 2018.
[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, "A hybrid KMV model, random forests and rough set theory approach for credit rating," Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.
[37] S. Oreski, D. Oreski, and G. Oreski, "Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment," Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.
[38] V. Kozeny, "Genetic algorithms for credit scoring: Alternative fitness function performance comparison," Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.
006
2000577
00764
00633
008
0200597
007
7500536
007
0600595
007
0416
00549
007
05006
84008
9300508
01264
00357
004
72006
42008
5100573
005
4900593
006
9800616
008
0700551
00748
1700549
006
6600549
007
5700538
006
7700588
007
10006
74008
6700615
004
9600535
006
41004
87006
3600 69 6
009
1518
00546
006
2900512
006
1500560
006
10006
07007
7400585
007
32006
87008
2400599
007
2900576
00746
00507
01069
1900492
005
5500572
006
8500657
004
3600544
006
5500434
006
3300589
007
5900581
006
73004
72006
3800623
01148
2000554
006
45004
13005
0400596
003
6600625
008
0800562
006
8700698
009
7800518
007
09006
01007
3400638
00744
Average
00554
006
6400566
007
3200598
008
2300553
006
6200560
007
3100595
008
0700552
006
6300564
007
2900597
008
19
Mathematical Problems in Engineering 9
Data Availability
The data this paper used is downloaded from the website ofProsper httpswwwprospercominvestdownloadaspx
Conflicts of Interest
The authors declare that there are no conflicts of interestregarding the publication of this paperrdquo
Acknowledgments
The research is supported by the National Natural ScienceFoundation of China (Grants nos 71471027 71731003 and71873103) the National Social Science Foundation of China(Grant no 16BTJ017) National Natural Science Foundationof China Youth Project (Grant no 71601041) LiaoningEconomic and Social Development Key Issues (Grant no2015lslktzdian-05) and Liaoning Provincial Social SciencePlanning Fund Project (Grant no L16BJY016) The authorsacknowledge the organizations mentioned above
References
[1] H Markowitz ldquoPortfolio selectionrdquoe Journal of Finance vol7 no 1 pp 77ndash91 1952
[2] H M Markowitz Portfolio Selection Efficient Diversication ofInvestment Wiley New York NY USA 1959
[3] N Larsen H Mausser and S Uryasev ldquoAlgorithms for opti-mization ofValue-atRiskrdquo in Financial Engineering ECommerceand Supply Chain Applied Optimization P M Pardalos andV K Tsitsiringos Eds vol 70 Kluwer Academic PublishersDordrecht 2002
[4] R T Rockafellar and S Uryasev ldquoConditional value-at-risk forgeneral loss distributionsrdquo Journal of Bankingamp Finance vol 26no 7 pp 1443ndash1471 2002
[5] Y H Guo W J Zhou C Y Luo C R Liu and H XiongldquoInstance-based credit risk assessment for investment decisionsin P2P Lendingrdquo European Journal of Operational Research vol249 no 2 pp 417ndash426 2016
[6] S C P Yam H Yang and F L Yuen ldquoOptimal asset allocationRisk and information uncertaintyrdquo European Journal of Opera-tional Research vol 251 no 2 pp 554ndash561 2016
[7] R Emekter Y Tu B Jirasakuldech and M Lu ldquoEvaluatingcredit risk and loan performance in online Peer-to-Peer (P2P)lendingrdquo Applied Economics vol 47 no 1 pp 54ndash70 2014
[8] E Berkovich ldquoSearch and herding effects in peer-to-peerlending evidence from prospercomrdquo Annals of Finance vol 7no 3 pp 389ndash405 2011
[9] E I Altman ldquoFinancial ratios discriminant analysis and theprediction of corporate bankruptcyrdquoe Journal of Finance vol23 no 4 pp 589ndash609 1968
[10] S Chatterjee and S Barcun ldquoA nonparametric approach tocredit screeningrdquo Publications of the American Statistical Asso-ciation vol 65 no 329 pp 150ndash154 1970
[11] J C Wigintor ldquoA note on the comparison of logit and discrim-inant models of consumer credit behaviorrdquo Journal of Financialand Quantitative Analysis vol 15 no 3 pp 757ndash770 1980
[12] L Breiman J H Friedman R Olshen and C Stone Classifi-cation and Regression Trees Wadsworth Belmont Calif USA1983
[13] M M So and L C Thomas ldquoModelling the profitability ofcredit cards by Markov decision processesrdquo European Journalof Operational Research vol 212 no 1 pp 123ndash130 2011
[14] G Andreeva J Ansell and J Crook ldquoModelling profitabilityusing survival combination scoresrdquo European Journal of Opera-tional Research vol 183 no 3 pp 1537ndash1549 2007
[15] D West ldquoNeural network credit scoring modelsrdquo Computers ampOperations Research vol 27 pp 1131ndash1152 2000
[16] J J Huang G H Tzeng and C S Ong ldquoTwo-stage geneticprogramming (2SGP) for the credit scoring modelrdquo AppliedMathematics and Computation vol 174 no 2 pp 1039ndash10532006
[17] C L Huang M C Chen and C J Wang ldquoCredit scoring witha data mining approach based on support vector machinesrdquoExpert Systems with Applications vol 33 no 4 pp 847ndash8562007
[18] P Danenas and G Garsva ldquoSelection of support vectormachines based classifiers for credit risk domainrdquo ExpertSystems with Applications vol 42 no 6 pp 3194ndash3204 2015
[19] G Sermpinis S Tsoukas and P Zhang ldquoModelling marketimplied ratings using LASSO variable selection techniquesrdquoJournal of Empirical Finance vol 48 pp 19ndash35 2018
[20] K Natarajan D Pachamanova andM Sim ldquoConstructing riskmeasures from uncertainty setsrdquo Operations Research vol 57no 5 pp 1129ndash1141 2009
[21] L Chen S He and S Zhang ldquoTight bounds for some riskmeasures with applications to robust portfolio selectionrdquoOper-ations Research vol 59 no 4 pp 847ndash865 2011
[22] L G Epstein ldquoA paradox for the ldquosmooth ambiguityrdquorsquo model ofpreferencerdquo Econometrica vol 78 no 6 pp 2085ndash2099 2010
[23] K Natarajan M Sim and J Uichanco ldquoTractable robustexpected utility and risk models for portfolio optimizationrdquoMathematical Finance vol 20 no 4 pp 695ndash731 2010
[24] A B Pac and M C Pınar ldquoRobust portfolio choice with CVaRand VaR under distribution and mean return ambiguityrdquo TOPvol 22 no 3 pp 875ndash891 2014
[25] L P Hansen and T J Sargent ldquoRobust control and modeluncertaintyrdquoe American Economic Review vol 91 no 2 pp60ndash66 2001
[26] G C Calafiore ldquoAmbiguous risk measures and optimal robustportfoliosrdquo Society for Industrial and Applied Mathematics vol18 no 3 pp 853ndash877 2007
[27] D Bertsimas V Gupta and N Kallus ldquoData-driven robustoptimizationrdquo Mathematical Programming vol 167 no 2 pp235ndash292 2018
[28] Z Kang X Li Z Li and S Zhu ldquoData-driven robust mean-CVaR portfolio selection under distribution ambiguityrdquo Quan-titative Finance pp 1ndash17 2018
[29] Q Li and J S Racine Nonparametric Econometrics eory andPractice Princeton University Press 2007
[30] O Scaillet ldquoNonparametric estimation and sensitivity analysisof expected shortfallrdquo Mathematical Finance vol 14 no 1 pp115ndash129 2004
[31] H Yao Z Li and Y Lai ldquoMeanndashCVaR portfolio selection Anonparametric estimation frameworkrdquo Computers amp Opera-tions Research vol 40 no 4 pp 1014ndash1022 2013
[32] E A Nadaraja ldquoOn non-parametric estimates of density func-tions and regressionrdquo eory of Probability amp Its Applicationsvol 10 no 1 pp 186ndash190 1965
10 Mathematical Problems in Engineering
[33] T Chen and T He ldquoHiggs boson discovery with boostedtreesrdquo in Proceedings of the NIPS 2014Workshop on High-energyPhysics and Machine Learning pp 69ndash80 2015
[34] Y Xia C Liu Y Li and N Liu ldquoA boosted decision treeapproach using Bayesian hyper-parameter optimization forcredit scoringrdquo Expert Systems with Applications vol 78 pp225ndash241 2017
[35] H He W Zhang and S Zhang ldquoA novel ensemble method forcredit scoring Adaption of different imbalance ratiosrdquo ExpertSystems with Applications vol 98 pp 105ndash117 2018
[36] C-C Yeh F Lin and C-Y Hsu ldquoA hybrid KMV modelrandom forests and rough set theory approach for credit ratingrdquoKnowledge-Based Systems vol 33 no 3 pp 166ndash172 2012
[37] SOreski DOreski andGOreski ldquoHybrid systemwith geneticalgorithm and artificial neural networks and its application toretail credit risk assessmentrdquo Expert Systems with Applicationsvol 39 no 16 pp 12605ndash12617 2012
[38] V Kozeny ldquoGenetic algorithms for credit scoring Alternativefitness function performance comparisonrdquo Expert Systems withApplications vol 42 no 6 pp 2998ndash3004 2015
Hindawiwwwhindawicom Volume 2018
MathematicsJournal of
Hindawiwwwhindawicom Volume 2018
Mathematical Problems in Engineering
Applied MathematicsJournal of
Hindawiwwwhindawicom Volume 2018
Probability and StatisticsHindawiwwwhindawicom Volume 2018
Journal of
Hindawiwwwhindawicom Volume 2018
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawiwwwhindawicom Volume 2018
OptimizationJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Engineering Mathematics
International Journal of
Hindawiwwwhindawicom Volume 2018
Operations ResearchAdvances in
Journal of
Hindawiwwwhindawicom Volume 2018
Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018
International Journal of Mathematics and Mathematical Sciences
Hindawiwwwhindawicom Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Hindawiwwwhindawicom Volume 2018Volume 2018
Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in
Nature and SocietyHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Dierential EquationsInternational Journal of
Volume 2018
Hindawiwwwhindawicom Volume 2018
Decision SciencesAdvances in
Hindawiwwwhindawicom Volume 2018
AnalysisInternational Journal of
Hindawiwwwhindawicom Volume 2018
Stochastic AnalysisInternational Journal of
Submit your manuscripts atwwwhindawicom
Figure 1: The curve of CV(h). (Plot of CV(h), from 0.095 to 0.1, against the bandwidth h, from 0 to 0.2.)
There exist many credit scoring models for predicting the default probability of a loan, such as the XGBoost model [33–35], the hybrid KMV model [36], credit scoring based on genetic algorithms [37, 38], and so on. However, how to choose and construct the optimal credit scoring model is beyond the scope of this study, and we use the most popular model, logistic regression, to make the prediction in this preprocessing step.
We randomly divide the dataset into two parts: one containing 40% of all loans, used to determine the optimal bandwidth h in (5) as described in detail in Section 5.3, and a second part containing the remaining 60% of the loans. Moreover, using k-fold cross-validation, we randomly divide the second part into 20 subsets, each of which contains approximately 510 loans. In each round, one of the subsets is used as the testing set, which consists of loans waiting to be invested in, so their pay-back statuses are treated as unknown; all other subsets are taken as the training set, which consists of historical loans with known yield.
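The partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the seed, and the total of 17,000 loans are assumptions (the total is chosen only so that the 60% part yields 20 subsets of 510 loans each).

```python
import random


def split_dataset(loan_ids, bandwidth_share=0.4, n_folds=20, seed=0):
    """Shuffle loans, hold out a share for bandwidth tuning, cut the rest into folds."""
    ids = list(loan_ids)
    random.Random(seed).shuffle(ids)
    n_bandwidth = int(round(bandwidth_share * len(ids)))
    bandwidth_part = ids[:n_bandwidth]
    remainder = ids[n_bandwidth:]
    # Deal the remaining loans round-robin so fold sizes differ by at most one.
    folds = [remainder[k::n_folds] for k in range(n_folds)]
    return bandwidth_part, folds


def train_test_rounds(folds):
    """Yield (training loans, testing loans), using each fold once as the testing set."""
    for k, test in enumerate(folds):
        train = [loan for j, fold in enumerate(folds) if j != k for loan in fold]
        yield train, test


# Hypothetical total of 17,000 loans (an assumption, not a figure from the paper).
bandwidth_part, folds = split_dataset(range(17000))
rounds = list(train_test_rounds(folds))
```

Each of the 20 rounds then trains on 19 subsets and tests on the remaining one, so every loan in the second part is used as a test loan exactly once.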
5.2. Model Description. In this paper, we propose a robust credit portfolio optimization model for investment decisions in P2P lending. To show its effectiveness, we compare it with a benchmark model proposed by Guo et al. [5]. In the following, we describe the two models in detail.
IOM is the instance-based model proposed by Guo et al. [5]. Each loan is assessed using kernel weights and the historical performance of similar loans. The classical mean-variance model (8) is then used to identify the optimal allocation strategy. This model outperforms some rating-based models, as the results of Guo et al. [5] show.
RIOM is the robust instance-based model proposed in this study. The expected return and risk of each loan are also assessed with the “instance-based” assessment framework. However, we use the robust model of credit portfolio optimization based on the relative entropy method, Equation (15), to obtain the optimal investment decision.
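As background for the relative entropy method, the following sketch computes the relative entropy (Kullback-Leibler divergence) between two discrete distributions. It does not reproduce Equation (15); the function name and the two probability vectors are hypothetical and serve only to show the quantity that measures how far an alternative distribution lies from a nominal one.

```python
import math


def relative_entropy(q, p):
    """Kullback-Leibler divergence D(q || p) of discrete distributions, in nats."""
    if len(q) != len(p):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for q_k, p_k in zip(q, p):
        if q_k == 0.0:
            continue  # the term 0 * log(0 / p) is taken as 0
        if p_k == 0.0:
            return math.inf  # q puts mass where p has none
        total += q_k * math.log(q_k / p_k)
    return total


nominal = [0.25, 0.25, 0.25, 0.25]      # hypothetical estimated scenario probabilities
alternative = [0.40, 0.30, 0.20, 0.10]  # hypothetical perturbed probabilities
divergence = relative_entropy(alternative, nominal)
```

The divergence is zero when the two distributions coincide and grows as the alternative moves away from the nominal one, which is what makes it usable as a bound on distribution ambiguity.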
We compare the two models by the following procedure:
(1) Train the credit risk assessment model with the training set, and use the trained model to predict the expected return ($\mu_i$) and variance ($\sigma_i$) of each loan in the testing set. Thus the expected return vector $\mu$ and the covariance matrix $V$ can be obtained.
(2) For each model, feed the predicted expected return vector $\mu$ and the covariance matrix $V$ of the testing loans into the portfolio optimization algorithm, and compute the performance of the investment in the optimal portfolio.
(3) Compare the return rates of the two models.
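The three steps can be sketched as a small evaluation loop. This is an illustrative skeleton under stated assumptions: the return rate of a portfolio is taken to be the amount-weighted average of the loans' actual returns, and `toy_assess`, `equal_weights`, and `best_predicted` are hypothetical stand-ins, not the IOM or RIOM models.

```python
def realized_return_rate(weights, actual_returns):
    """Amount-weighted average of actual loan returns (assumed performance measure)."""
    invested = sum(weights)
    if invested <= 0:
        raise ValueError("portfolio must invest a positive amount")
    return sum(w * r for w, r in zip(weights, actual_returns)) / invested


def compare_models(rounds, assess, optimizers):
    """Run steps (1)-(3) over all rounds and average each model's return rate.

    rounds:     iterable of (train, test); test is a list of (features, actual_return)
    assess:     callable(train, test_features) -> (mu, V)        [step 1]
    optimizers: dict name -> callable(mu, V) -> list of weights  [step 2]
    """
    totals = {name: 0.0 for name in optimizers}
    n_rounds = 0
    for train, test in rounds:
        features = [x for x, _ in test]
        actual = [r for _, r in test]
        mu, V = assess(train, features)
        for name, optimize in optimizers.items():
            totals[name] += realized_return_rate(optimize(mu, V), actual)
        n_rounds += 1
    return {name: total / n_rounds for name, total in totals.items()}  # step 3


# Toy stand-ins (hypothetical) that only exercise the loop.
def toy_assess(train, features):
    mu = list(features)  # pretend the feature is the predicted return
    V = [[1.0 if i == j else 0.0 for j in range(len(mu))] for i in range(len(mu))]
    return mu, V


def equal_weights(mu, V):
    return [1.0] * len(mu)


def best_predicted(mu, V):
    top = max(range(len(mu)), key=lambda i: mu[i])
    return [1.0 if i == top else 0.0 for i in range(len(mu))]


toy_rounds = [([], [(0.05, 0.04), (0.08, 0.10)]),
              ([], [(0.06, 0.06), (0.03, 0.02)])]
result = compare_models(toy_rounds, toy_assess,
                        {"equal": equal_weights, "top": best_predicted})
```

Passing both optimizers through the same `assess` output mirrors the setup above, in which the two models share the instance-based assessment and differ only in the portfolio optimization step.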
5.3. Analysis of Results. As mentioned before, we select the Gaussian kernel $K(\zeta) = (1/\sqrt{2\pi})\, e^{-\zeta^2/2}$ as the kernel function. The key parameter of the kernel regression model, the bandwidth h, is optimized by the following leave-one-out cross-validation:
$$h_{\mathrm{optimal}} = \arg\min_{h} CV(h) = \arg\min_{h} \sum_{i=1}^{n} \left( \mu_h(p_{-i}) - \mu_i \right)^2 \tag{16}$$

where $\mu_h(p_{-i})$ is the leave-one-out estimate of the expected return rate $\mu_i$; specifically,

$$\mu_h(p_{-i}) = \sum_{j=1,\, j \neq i}^{n} \left[ \frac{K\left((p_i - p_j)/h\right)}{\sum_{j=1,\, j \neq i}^{n} K\left((p_i - p_j)/h\right)} \cdot R_j \right] \tag{17}$$
The curve of CV(h) is shown in Figure 1. The shape of the curve clearly shows a minimum, and the h corresponding to the minimum is the optimal bandwidth for the model.
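The bandwidth search can be sketched in a few lines. This is a minimal illustration of the leave-one-out criterion in (16) and (17), not the authors' implementation: the toy values of `p` and `R`, the grid, the use of the realized returns as the target in place of $\mu_i$, and the fallback for vanishing weights are all assumptions.

```python
import math


def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def loo_estimate(i, p, R, h):
    """Leave-one-out kernel estimate for loan i, following Equation (17)."""
    others = [j for j in range(len(p)) if j != i]
    weights = [gaussian_kernel((p[i] - p[j]) / h) for j in others]
    denom = sum(weights)
    if denom == 0.0:
        # Fallback when every weight underflows to zero (an assumption).
        return sum(R[j] for j in others) / len(others)
    return sum(w * R[j] for w, j in zip(weights, others)) / denom


def cv_score(p, R, target, h):
    """CV(h) from Equation (16): squared gap between estimates and targets."""
    return sum((loo_estimate(i, p, R, h) - target[i]) ** 2 for i in range(len(p)))


def select_bandwidth(p, R, target, grid):
    """Return the grid value of h with the smallest CV(h)."""
    return min(grid, key=lambda h: cv_score(p, R, target, h))


# Toy data (hypothetical): p plays the role of p_i, R the realized returns R_j.
p = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
R = [0.09, 0.08, 0.06, 0.05, 0.02, -0.01]
grid = [round(0.02 * k, 2) for k in range(1, 11)]  # 0.02, 0.04, ..., 0.20
h_optimal = select_bandwidth(p, R, R, grid)
```

A grid search is used here because CV(h) is cheap to evaluate on a one-dimensional grid; plotting `cv_score` over the grid gives a curve of the kind shown in Figure 1.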
To apply the robust credit portfolio optimization method and obtain the optimal investment strategy in problem (13), we select the parameter $\zeta = 0.75$, the investment amount M = 15 thousand dollars, and the required rate of return $R^* = 0.05$. We also set the risk-free return rate to 0.025, which is approximately the average yield of T-Bills over the same period. We use the MATLAB built-in solver “quadprog” to solve the two portfolio optimization problems.
Table 2 summarizes the investment return rate of each test subset and the average performance on the Prosper dataset. It shows that the two portfolios are almost always efficient and feasible, except for subset 16. The results also show that the actual performance of the optimal portfolio derived from RIOM always exceeds that of the optimal portfolio from IOM. The Sharpe ratio shows that the median-based optimal portfolio performs better as well.
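The comparison reported in Table 2 can be checked with a few lines of arithmetic. The return rates below are copied from Table 2; the variable names are illustrative, and treating the required rate of 0.05 as the feasibility threshold is an assumption about how the claim is read.

```python
# Return rates from Table 2 (M = 15 thousand dollars, R* = 0.05), subsets 1 to 20.
iom = [0.0501, 0.0550, 0.0540, 0.0564, 0.0627, 0.0543, 0.0532, 0.0605, 0.0593, 0.0546,
       0.0637, 0.0567, 0.0468, 0.0519, 0.0544, 0.0357, 0.0588, 0.0607, 0.0544, 0.0625]
riom = [0.0566, 0.0633, 0.0618, 0.0696, 0.0714, 0.0629, 0.0688, 0.0711, 0.0706, 0.0664,
        0.0701, 0.0640, 0.0569, 0.0663, 0.0620, 0.0472, 0.0710, 0.0774, 0.0655, 0.0808]
required_rate = 0.05

mean_iom = sum(iom) / len(iom)
mean_riom = sum(riom) / len(riom)
riom_wins = sum(r > i for i, r in zip(iom, riom))          # subsets where RIOM is higher
iom_meets_target = sum(i >= required_rate for i in iom)    # subsets reaching R*
riom_meets_target = sum(r >= required_rate for r in riom)

print(f"mean IOM = {mean_iom:.4f}, mean RIOM = {mean_riom:.4f}")
print(f"RIOM higher in {riom_wins} of {len(iom)} subsets")
print(f"subsets reaching R*: IOM {iom_meets_target}, RIOM {riom_meets_target}")
```

On these figures RIOM is higher in all 20 subsets. Subset 16 is the only one in which both portfolios fall short of 0.05, and IOM also falls short in subset 13.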
Figure 2: Performance comparison. (Return rate of investment, from 0 to 0.09, for IOM and RIOM against the number of the parameter set, 1 to 10.)
Table 2: Rate of return from the optimal portfolio on the Prosper dataset.

Subset    IOM      RIOM
1         0.0501   0.0566
2         0.0550   0.0633
3         0.0540   0.0618
4         0.0564   0.0696
5         0.0627   0.0714
6         0.0543   0.0629
7         0.0532   0.0688
8         0.0605   0.0711
9         0.0593   0.0706
10        0.0546   0.0664
11        0.0637   0.0701
12        0.0567   0.0640
13        0.0468   0.0569
14        0.0519   0.0663
15        0.0544   0.0620
16        0.0357   0.0472
17        0.0588   0.0710
18        0.0607   0.0774
19        0.0544   0.0655
20        0.0625   0.0808
Average   0.0553   0.0662
To verify that the conclusions obtained from the above experiments are stable, we consider different investment amounts and required returns as input parameters for portfolio selection and keep the other conditions unchanged. As summarized in Table 3, we consider nine parameter pairs of required return rate $R^*$ and investment amount M.
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual return rate of investment. The more intuitive results are shown in Figure 2, which compares the actual return rates of the two models. The first 9 numbers on the horizontal axis in Figure 2 represent the corresponding parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. We find that the RIOM model outperforms the IOM model comprehensively.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10000                5.0%
2     $10000                5.5%
3     $10000                6.0%
4     $15000                5.0%
5     $15000                5.5%
6     $15000                6.0%
7     $20000                5.0%
8     $20000                5.5%
9     $20000                6.0%
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.
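The averages in the last row of Table 4 can be compared across the nine parameter sets as follows. The figures are copied from that row; the dictionary layout and variable names are illustrative.

```python
# Average row of Table 4: (M in dollars, R* in %) -> (IOM, RIOM) average return rates.
table4_average = {
    (10000, 5.0): (0.0554, 0.0664),
    (10000, 5.5): (0.0566, 0.0732),
    (10000, 6.0): (0.0598, 0.0823),
    (15000, 5.0): (0.0553, 0.0662),
    (15000, 5.5): (0.0560, 0.0731),
    (15000, 6.0): (0.0595, 0.0807),
    (20000, 5.0): (0.0552, 0.0663),
    (20000, 5.5): (0.0564, 0.0729),
    (20000, 6.0): (0.0597, 0.0819),
}

# Gain of RIOM over IOM for each parameter set, then averaged overall and by R*.
gains = {key: riom - iom for key, (iom, riom) in table4_average.items()}
mean_gain = sum(gains.values()) / len(gains)

gains_by_rate = {}
for (amount, rate), gain in gains.items():
    gains_by_rate.setdefault(rate, []).append(gain)
mean_gain_by_rate = {rate: sum(v) / len(v) for rate, v in gains_by_rate.items()}

for rate in sorted(mean_gain_by_rate):
    print(f"R* = {rate}%: mean gain = {mean_gain_by_rate[rate]:.4f}")
print(f"overall mean gain = {mean_gain:.4f}")
```

On these averages RIOM is ahead in all nine sets, and the gap widens as the required rate rises, while the investment amount has little effect.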
6. Conclusions
In this paper, we formulate a data-driven robust model of portfolio optimization with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. Firstly, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive and quantified risk estimate instead of just labelling each loan with a credit grade. Secondly, this model can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, this model considers the loans' distribution ambiguity (probability measure uncertainty) problem and uses relative entropy to model parameter uncertainty, to ensure that the optimal allocation strategy is efficient and feasible under various actual scenarios. Numerical experiments imply that the P2P lending investment decision model using robust optimization with relative entropy constraints provides better performance than the existing model.
Table 4: Investment performances of input parameters for portfolio selection.

M = 10000
          R* = 5%           R* = 5.5%         R* = 6%
Subset    IOM      RIOM     IOM      RIOM     IOM      RIOM
1         0.0598   0.0727   0.0601   0.0762   0.0502   0.0722
2         0.0500   0.0592   0.0601   0.0779   0.0675   0.0851
3         0.0441   0.0571   0.0491   0.0698   0.0735   0.0934
4         0.0525   0.0602   0.0658   0.0903   0.0636   0.0754
5         0.0532   0.0620   0.0631   0.0829   0.0513   0.0731
6         0.0634   0.0747   0.0564   0.0762   0.0717   0.1105
7         0.0613   0.0736   0.0547   0.0754   0.0551   0.0884
8         0.0529   0.0594   0.0505   0.0616   0.0685   0.0858
9         0.0548   0.0647   0.0550   0.0736   0.0559   0.0493
10        0.0474   0.0574   0.0472   0.0634   0.0499   0.0794
11        0.0597   0.0730   0.0602   0.0795   0.0661   0.1090
12        0.0644   0.0768   0.0541   0.0673   0.0624   0.1042
13        0.0635   0.0785   0.0709   0.0895   0.0532   0.0662
14        0.0593   0.0744   0.0626   0.0751   0.0634   0.1204
15        0.0523   0.0636   0.0485   0.0609   0.0571   0.0987
16        0.0549   0.0705   0.0684   0.0893   0.0508   0.1264
17        0.0549   0.0666   0.0549   0.0757   0.0538   0.0677
18        0.0546   0.0629   0.0512   0.0615   0.0560   0.0610
19        0.0492   0.0555   0.0572   0.0685   0.0657   0.0436
20        0.0554   0.0645   0.0413   0.0504   0.0596   0.0366
Average   0.0554   0.0664   0.0566   0.0732   0.0598   0.0823

M = 15000
          R* = 5%           R* = 5.5%         R* = 6%
Subset    IOM      RIOM     IOM      RIOM     IOM      RIOM
1         0.0501   0.0566   0.0558   0.0839   0.0520   0.0833
2         0.0550   0.0633   0.0517   0.0620   0.0551   0.0648
3         0.0540   0.0618   0.0598   0.0778   0.0631   0.0646
4         0.0564   0.0696   0.0512   0.0629   0.0553   0.0847
5         0.0627   0.0714   0.0566   0.0721   0.0616   0.0896
6         0.0543   0.0629   0.0570   0.0772   0.0585   0.0874
7         0.0532   0.0688   0.0528   0.0727   0.0620   0.0566
8         0.0605   0.0711   0.0545   0.0768   0.0645   0.0855
9         0.0593   0.0706   0.0507   0.0576   0.0574   0.1214
10        0.0546   0.0664   0.0528   0.0634   0.0622   0.0631
11        0.0637   0.0701   0.0562   0.0756   0.0498   0.0662
12        0.0567   0.0640   0.0529   0.0677   0.0574   0.0983
13        0.0468   0.0569   0.0637   0.0880   0.0504   0.0913
14        0.0519   0.0663   0.0568   0.0716   0.0614   0.1162
15        0.0544   0.0620   0.0577   0.0764   0.0633   0.0802
16        0.0357   0.0472   0.0642   0.0851   0.0573   0.0549
17        0.0588   0.0710   0.0674   0.0867   0.0615   0.0496
18        0.0607   0.0774   0.0585   0.0732   0.0687   0.0824
19        0.0544   0.0655   0.0434   0.0633   0.0589   0.0759
20        0.0625   0.0808   0.0562   0.0687   0.0698   0.0978
Average   0.0553   0.0662   0.0560   0.0731   0.0595   0.0807

M = 20000
          R* = 5%           R* = 5.5%         R* = 6%
Subset    IOM      RIOM     IOM      RIOM     IOM      RIOM
1         0.0594   0.0774   0.0544   0.0689   0.0691   0.0649
2         0.0504   0.0584   0.0664   0.0895   0.0661   0.0769
3         0.0503   0.0568   0.0554   0.0737   0.0647   0.0869
4         0.0566   0.0648   0.0617   0.0856   0.0518   0.0635
5         0.0576   0.0709   0.0547   0.0692   0.0610   0.0771
6         0.0584   0.0683   0.0528   0.0701   0.0516   0.0385
7         0.0481   0.0549   0.0485   0.0631   0.0460   0.0759
8         0.0545   0.0701   0.0628   0.0823   0.0592   0.0845
9         0.0535   0.0586   0.0561   0.0764   0.0574   0.1038
10        0.0514   0.0597   0.0582   0.0683   0.0689   0.0532
11        0.0531   0.0584   0.0569   0.0657   0.0572   0.1141
12        0.0551   0.0678   0.0536   0.0734   0.0618   0.0687
13        0.0555   0.0695   0.0636   0.0824   0.0616   0.1157
14        0.0577   0.0674   0.0541   0.0636   0.0572   0.0818
15        0.0597   0.0775   0.0536   0.0706   0.0595   0.0704
16        0.0593   0.0698   0.0616   0.0807   0.0551   0.0748
17        0.0535   0.0641   0.0487   0.0636   0.0696   0.0915
18        0.0599   0.0729   0.0576   0.0746   0.0507   0.1069
19        0.0581   0.0673   0.0472   0.0638   0.0623   0.1148
20        0.0518   0.0709   0.0601   0.0734   0.0638   0.0744
Average   0.0552   0.0663   0.0564   0.0729   0.0597   0.0819
Data Availability
The data used in this paper were downloaded from the website of Prosper (https://www.prosper.com/invest/download.aspx).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grant nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
References
[1] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, no. 1, pp. 77–91, 1952.
[2] H. M. Markowitz, Portfolio Selection: Efficient Diversification of Investment, Wiley, New York, NY, USA, 1959.
[3] N. Larsen, H. Mausser, and S. Uryasev, “Algorithms for optimization of Value-at-Risk,” in Financial Engineering, E-Commerce and Supply Chain, Applied Optimization, P. M. Pardalos and V. K. Tsitsiringos, Eds., vol. 70, Kluwer Academic Publishers, Dordrecht, 2002.
[4] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for general loss distributions,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.
[5] Y. H. Guo, W. J. Zhou, C. Y. Luo, C. R. Liu, and H. Xiong, “Instance-based credit risk assessment for investment decisions in P2P Lending,” European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.
[6] S. C. P. Yam, H. Yang, and F. L. Yuen, “Optimal asset allocation: Risk and information uncertainty,” European Journal of Operational Research, vol. 251, no. 2, pp. 554–561, 2016.
[7] R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, “Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending,” Applied Economics, vol. 47, no. 1, pp. 54–70, 2014.
[8] E. Berkovich, “Search and herding effects in peer-to-peer lending: evidence from prosper.com,” Annals of Finance, vol. 7, no. 3, pp. 389–405, 2011.
[9] E. I. Altman, “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy,” The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.
[10] S. Chatterjee and S. Barcun, “A nonparametric approach to credit screening,” Publications of the American Statistical Association, vol. 65, no. 329, pp. 150–154, 1970.
[11] J. C. Wiginton, “A note on the comparison of logit and discriminant models of consumer credit behavior,” Journal of Financial and Quantitative Analysis, vol. 15, no. 3, pp. 757–770, 1980.
[12] L. Breiman, J. H. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees, Wadsworth, Belmont, Calif, USA, 1983.
[13] M. M. So and L. C. Thomas, “Modelling the profitability of credit cards by Markov decision processes,” European Journal of Operational Research, vol. 212, no. 1, pp. 123–130, 2011.
[14] G. Andreeva, J. Ansell, and J. Crook, “Modelling profitability using survival combination scores,” European Journal of Operational Research, vol. 183, no. 3, pp. 1537–1549, 2007.
[15] D. West, “Neural network credit scoring models,” Computers & Operations Research, vol. 27, pp. 1131–1152, 2000.
[16] J. J. Huang, G. H. Tzeng, and C. S. Ong, “Two-stage genetic programming (2SGP) for the credit scoring model,” Applied Mathematics and Computation, vol. 174, no. 2, pp. 1039–1053, 2006.
[17] C. L. Huang, M. C. Chen, and C. J. Wang, “Credit scoring with a data mining approach based on support vector machines,” Expert Systems with Applications, vol. 33, no. 4, pp. 847–856, 2007.
[18] P. Danenas and G. Garsva, “Selection of support vector machines based classifiers for credit risk domain,” Expert Systems with Applications, vol. 42, no. 6, pp. 3194–3204, 2015.
[19] G. Sermpinis, S. Tsoukas, and P. Zhang, “Modelling market implied ratings using LASSO variable selection techniques,” Journal of Empirical Finance, vol. 48, pp. 19–35, 2018.
[20] K. Natarajan, D. Pachamanova, and M. Sim, “Constructing risk measures from uncertainty sets,” Operations Research, vol. 57, no. 5, pp. 1129–1141, 2009.
[21] L. Chen, S. He, and S. Zhang, “Tight bounds for some risk measures, with applications to robust portfolio selection,” Operations Research, vol. 59, no. 4, pp. 847–865, 2011.
[22] L. G. Epstein, “A paradox for the ‘smooth ambiguity’ model of preference,” Econometrica, vol. 78, no. 6, pp. 2085–2099, 2010.
[23] K. Natarajan, M. Sim, and J. Uichanco, “Tractable robust expected utility and risk models for portfolio optimization,” Mathematical Finance, vol. 20, no. 4, pp. 695–731, 2010.
[24] A. B. Pac and M. C. Pınar, “Robust portfolio choice with CVaR and VaR under distribution and mean return ambiguity,” TOP, vol. 22, no. 3, pp. 875–891, 2014.
[25] L. P. Hansen and T. J. Sargent, “Robust control and model uncertainty,” The American Economic Review, vol. 91, no. 2, pp. 60–66, 2001.
[26] G. C. Calafiore, “Ambiguous risk measures and optimal robust portfolios,” Society for Industrial and Applied Mathematics, vol. 18, no. 3, pp. 853–877, 2007.
[27] D. Bertsimas, V. Gupta, and N. Kallus, “Data-driven robust optimization,” Mathematical Programming, vol. 167, no. 2, pp. 235–292, 2018.
[28] Z. Kang, X. Li, Z. Li, and S. Zhu, “Data-driven robust mean-CVaR portfolio selection under distribution ambiguity,” Quantitative Finance, pp. 1–17, 2018.
[29] Q. Li and J. S. Racine, Nonparametric Econometrics: Theory and Practice, Princeton University Press, 2007.
[30] O. Scaillet, “Nonparametric estimation and sensitivity analysis of expected shortfall,” Mathematical Finance, vol. 14, no. 1, pp. 115–129, 2004.
[31] H. Yao, Z. Li, and Y. Lai, “Mean–CVaR portfolio selection: A nonparametric estimation framework,” Computers & Operations Research, vol. 40, no. 4, pp. 1014–1022, 2013.
[32] E. A. Nadaraja, “On non-parametric estimates of density functions and regression,” Theory of Probability & Its Applications, vol. 10, no. 1, pp. 186–190, 1965.
[33] T. Chen and T. He, “Higgs boson discovery with boosted trees,” in Proceedings of the NIPS 2014 Workshop on High-energy Physics and Machine Learning, pp. 69–80, 2015.
[34] Y. Xia, C. Liu, Y. Li, and N. Liu, “A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring,” Expert Systems with Applications, vol. 78, pp. 225–241, 2017.
[35] H. He, W. Zhang, and S. Zhang, “A novel ensemble method for credit scoring: Adaption of different imbalance ratios,” Expert Systems with Applications, vol. 98, pp. 105–117, 2018.
[36] C.-C. Yeh, F. Lin, and C.-Y. Hsu, “A hybrid KMV model, random forests and rough set theory approach for credit rating,” Knowledge-Based Systems, vol. 33, no. 3, pp. 166–172, 2012.
[37] S. Oreski, D. Oreski, and G. Oreski, “Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment,” Expert Systems with Applications, vol. 39, no. 16, pp. 12605–12617, 2012.
[38] V. Kozeny, “Genetic algorithms for credit scoring: Alternative fitness function performance comparison,” Expert Systems with Applications, vol. 42, no. 6, pp. 2998–3004, 2015.
Mathematical Problems in Engineering 7
1 1098765432The number of parameters set
IOMRIOM
0001002003004005006007008009
Retu
rn ra
te o
f inv
estm
ent
Figure 2 Performance comparison
Table 2 Rate of return from the optimal portfolio on the Prosperdataset
Subset IOM RIOM1 00501 005662 00550 006333 00540 006184 00564 006965 00627 007146 00543 006297 00532 006888 00605 007119 00593 0070610 00546 0066411 00637 0070112 00567 0064013 00468 0056914 00519 0066315 00544 0062016 00357 0047217 00588 0071018 00607 0077419 00544 0065520 00625 00808Average 00553 00662
In order to test and verify that the conclusions obtainedfrom the above experiments are stable we consider dif-ferent investment amounts and required returns as inputparameters for portfolio selection and keep other conditionsunchanged As summarized in Table 3 we consider nineparameters pairs about required return rate 119877lowast and invest-ment amount M
The computational results for each parameter pair are summarized in Table 4, which compares the two optimal portfolios in terms of the actual rate of return on investment. Figure 2 presents the same comparison more intuitively. The first nine numbers on the horizontal axis of Figure 2 correspond to the parameter combinations (sets 1 through 9 from Table 3), and the number 10 shows the average. On average, the RIOM model outperforms the IOM model under all nine parameter sets.

Table 3: Investors' choices of input parameters for portfolio selection.

Set   Investment amount M   Required rate R*
1     $10000                5.0
2     $10000                5.5
3     $10000                6.0
4     $15000                5.0
5     $15000                5.5
6     $15000                6.0
7     $20000                5.0
8     $20000                5.5
9     $20000                6.0
In conclusion, the optimal portfolio identified by the robust optimization model in this study is more efficient than that of the existing model, and the performance of our model is more robust and stable.
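The stability claim can be illustrated with the averages reported for the nine parameter sets. The sketch below is in Python; the tuple layout and variable names are illustrative assumptions, and the numbers are transcribed from the "Average" row of Table 4.

```python
# (investment amount M, required rate R*, IOM average, RIOM average),
# transcribed from the "Average" row of Table 4 for parameter sets 1 to 9.
table4_averages = [
    (10000, 5.0, 0.0554, 0.0664),
    (10000, 5.5, 0.0566, 0.0732),
    (10000, 6.0, 0.0598, 0.0823),
    (15000, 5.0, 0.0553, 0.0662),
    (15000, 5.5, 0.0560, 0.0731),
    (15000, 6.0, 0.0595, 0.0807),
    (20000, 5.0, 0.0552, 0.0663),
    (20000, 5.5, 0.0564, 0.0729),
    (20000, 6.0, 0.0597, 0.0819),
]

# Gap between the RIOM and IOM averages for each parameter set.
gaps = [riom - iom for _, _, iom, riom in table4_averages]
riom_ahead_everywhere = all(gap > 0 for gap in gaps)

# Group the gaps by required rate R* and average them over the three amounts.
gaps_by_rate = {}
for _, rate, iom, riom in table4_averages:
    gaps_by_rate.setdefault(rate, []).append(riom - iom)
mean_gap_by_rate = {rate: sum(v) / len(v) for rate, v in gaps_by_rate.items()}

print("RIOM average above IOM average in every set:", riom_ahead_everywhere)
for rate in sorted(mean_gap_by_rate):
    print(f"R* = {rate}: mean gap = {mean_gap_by_rate[rate]:.4f}")
```

On these transcribed averages the RIOM average exceeds the IOM average in every set, and the gap is larger for higher required rates. The second point is only an observation from the reported numbers; the paper does not discuss it.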
6 Conclusions
In this paper, we formulate a data-driven robust portfolio optimization model with relative entropy constraints, based on an instance-based credit risk assessment framework, for investment decisions in P2P lending. This P2P lending investment decision model has at least three advantages. First, it provides a more refined measure of P2P loans' risk and gives investors a more intuitive, quantified risk estimate instead of merely labelling each loan with a credit grade. Second, it can estimate each loan's expected return and risk when historical observations of the same borrower are unavailable. Finally, it accounts for the ambiguity of the loans' distribution (probability measure uncertainty) and uses relative entropy to model parameter uncertainty, so that the optimal allocation strategy remains efficient and feasible under various actual scenarios. Numerical experiments indicate that the P2P lending investment decision model using robust optimization with relative entropy constraints performs better than the existing model.
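To make the role of the relative entropy constraint concrete, the following Python sketch computes the worst-case expected return of a discrete return distribution over all distributions whose relative entropy from a nominal distribution is at most eta. It relies on the standard result that the worst case is an exponential tilting of the nominal distribution. It is not the paper's full robust mean-variance model; the function names, the three-scenario example, and the values of eta are hypothetical and chosen only for illustration.

```python
import math


def tilt(probs, losses, theta):
    """Exponentially tilted distribution: q_i proportional to p_i * exp(theta * loss_i)."""
    shift = max(theta * l for l in losses)  # subtract the largest exponent for stability
    weights = [p * math.exp(theta * l - shift) for p, l in zip(probs, losses)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(q, p):
    """Relative entropy KL(q || p) of two discrete distributions on the same support."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def worst_case_mean_return(returns, probs, eta, theta_cap=1e4, iterations=200):
    """Smallest expected return over all q with KL(q || probs) <= eta (hypothetical helper)."""
    losses = [-r for r in returns]
    if eta <= 0.0:
        # No ambiguity allowed: the nominal expectation is the only candidate.
        return sum(p * r for p, r in zip(probs, returns))
    # KL of the tilted distribution grows with theta; bracket the radius eta.
    low, high = 0.0, 1.0
    while high < theta_cap and kl_divergence(tilt(probs, losses, high), probs) < eta:
        high *= 2.0
    if kl_divergence(tilt(probs, losses, high), probs) < eta:
        # The radius is large enough (up to the cap) to put all mass on the worst scenario.
        return min(returns)
    # Bisection on theta so that the tilted distribution sits on the boundary KL = eta.
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if kl_divergence(tilt(probs, losses, mid), probs) < eta:
            low = mid
        else:
            high = mid
    q = tilt(probs, losses, high)
    return sum(qi * r for qi, r in zip(q, returns))


# Hypothetical three-scenario loan: full repayment, partial repayment, default.
scenario_returns = [0.08, 0.05, -0.20]
nominal_probs = [0.5, 0.4, 0.1]
for eta in (0.0, 0.05, 0.5):
    value = worst_case_mean_return(scenario_returns, nominal_probs, eta)
    print(f"eta = {eta}: worst-case expected return = {value:.4f}")
```

A larger eta expresses less trust in the estimated distribution, so the worst-case expected return falls as eta grows; with eta = 0 the nominal expectation is recovered.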
Table 4: Investment performances of input parameters for portfolio selection.

M = 10000
          R* = 5             R* = 5.5           R* = 6
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0598   0.0727    0.0601   0.0762    0.0502   0.0722
2         0.0500   0.0592    0.0601   0.0779    0.0675   0.0851
3         0.0441   0.0571    0.0491   0.0698    0.0735   0.0934
4         0.0525   0.0602    0.0658   0.0903    0.0636   0.0754
5         0.0532   0.0620    0.0631   0.0829    0.0513   0.0731
6         0.0634   0.0747    0.0564   0.0762    0.0717   0.1105
7         0.0613   0.0736    0.0547   0.0754    0.0551   0.0884
8         0.0529   0.0594    0.0505   0.0616    0.0685   0.0858
9         0.0548   0.0647    0.0550   0.0736    0.0559   0.0493
10        0.0474   0.0574    0.0472   0.0634    0.0499   0.0794
11        0.0597   0.0730    0.0602   0.0795    0.0661   0.1090
12        0.0644   0.0768    0.0541   0.0673    0.0624   0.1042
13        0.0635   0.0785    0.0709   0.0895    0.0532   0.0662
14        0.0593   0.0744    0.0626   0.0751    0.0634   0.1204
15        0.0523   0.0636    0.0485   0.0609    0.0571   0.0987
16        0.0549   0.0705    0.0684   0.0893    0.0508   0.1264
17        0.0549   0.0666    0.0549   0.0757    0.0538   0.0677
18        0.0546   0.0629    0.0512   0.0615    0.0560   0.0610
19        0.0492   0.0555    0.0572   0.0685    0.0657   0.0436
20        0.0554   0.0645    0.0413   0.0504    0.0596   0.0366
Average   0.0554   0.0664    0.0566   0.0732    0.0598   0.0823

M = 15000
          R* = 5             R* = 5.5           R* = 6
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0501   0.0566    0.0558   0.0839    0.0520   0.0833
2         0.0550   0.0633    0.0517   0.0620    0.0551   0.0648
3         0.0540   0.0618    0.0598   0.0778    0.0631   0.0646
4         0.0564   0.0696    0.0512   0.0629    0.0553   0.0847
5         0.0627   0.0714    0.0566   0.0721    0.0616   0.0896
6         0.0543   0.0629    0.0570   0.0772    0.0585   0.0874
7         0.0532   0.0688    0.0528   0.0727    0.0620   0.0566
8         0.0605   0.0711    0.0545   0.0768    0.0645   0.0855
9         0.0593   0.0706    0.0507   0.0576    0.0574   0.1214
10        0.0546   0.0664    0.0528   0.0634    0.0622   0.0631
11        0.0637   0.0701    0.0562   0.0756    0.0498   0.0662
12        0.0567   0.0640    0.0529   0.0677    0.0574   0.0983
13        0.0468   0.0569    0.0637   0.0880    0.0504   0.0913
14        0.0519   0.0663    0.0568   0.0716    0.0614   0.1162
15        0.0544   0.0620    0.0577   0.0764    0.0633   0.0802
16        0.0357   0.0472    0.0642   0.0851    0.0573   0.0549
17        0.0588   0.0710    0.0674   0.0867    0.0615   0.0496
18        0.0607   0.0774    0.0585   0.0732    0.0687   0.0824
19        0.0544   0.0655    0.0434   0.0633    0.0589   0.0759
20        0.0625   0.0808    0.0562   0.0687    0.0698   0.0978
Average   0.0553   0.0662    0.0560   0.0731    0.0595   0.0807

M = 20000
          R* = 5             R* = 5.5           R* = 6
Subset    IOM      RIOM      IOM      RIOM      IOM      RIOM
1         0.0594   0.0774    0.0544   0.0689    0.0691   0.0649
2         0.0504   0.0584    0.0664   0.0895    0.0661   0.0769
3         0.0503   0.0568    0.0554   0.0737    0.0647   0.0869
4         0.0566   0.0648    0.0617   0.0856    0.0518   0.0635
5         0.0576   0.0709    0.0547   0.0692    0.0610   0.0771
6         0.0584   0.0683    0.0528   0.0701    0.0516   0.0385
7         0.0481   0.0549    0.0485   0.0631    0.0460   0.0759
8         0.0545   0.0701    0.0628   0.0823    0.0592   0.0845
9         0.0535   0.0586    0.0561   0.0764    0.0574   0.1038
10        0.0514   0.0597    0.0582   0.0683    0.0689   0.0532
11        0.0531   0.0584    0.0569   0.0657    0.0572   0.1141
12        0.0551   0.0678    0.0536   0.0734    0.0618   0.0687
13        0.0555   0.0695    0.0636   0.0824    0.0616   0.1157
14        0.0577   0.0674    0.0541   0.0636    0.0572   0.0818
15        0.0597   0.0775    0.0536   0.0706    0.0595   0.0704
16        0.0593   0.0698    0.0616   0.0807    0.0551   0.0748
17        0.0535   0.0641    0.0487   0.0636    0.0696   0.0915
18        0.0599   0.0729    0.0576   0.0746    0.0507   0.1069
19        0.0581   0.0673    0.0472   0.0638    0.0623   0.1148
20        0.0518   0.0709    0.0601   0.0734    0.0638   0.0744
Average   0.0552   0.0663    0.0564   0.0729    0.0597   0.0819
Data Availability
The data used in this paper were downloaded from the Prosper website: https://www.prosper.com/invest/download.aspx
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The research is supported by the National Natural Science Foundation of China (Grant nos. 71471027, 71731003, and 71873103), the National Social Science Foundation of China (Grant no. 16BTJ017), the National Natural Science Foundation of China Youth Project (Grant no. 71601041), Liaoning Economic and Social Development Key Issues (Grant no. 2015lslktzdian-05), and the Liaoning Provincial Social Science Planning Fund Project (Grant no. L16BJY016). The authors acknowledge the organizations mentioned above.
Hindawiwwwhindawicom Volume 2018
MathematicsJournal of
Hindawiwwwhindawicom Volume 2018
Mathematical Problems in Engineering
Applied MathematicsJournal of
Hindawiwwwhindawicom Volume 2018
Probability and StatisticsHindawiwwwhindawicom Volume 2018
Journal of
Hindawiwwwhindawicom Volume 2018
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawiwwwhindawicom Volume 2018
OptimizationJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Engineering Mathematics
International Journal of
Hindawiwwwhindawicom Volume 2018
Operations ResearchAdvances in
Journal of
Hindawiwwwhindawicom Volume 2018
Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018
International Journal of Mathematics and Mathematical Sciences
Hindawiwwwhindawicom Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Hindawiwwwhindawicom Volume 2018Volume 2018
Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in
Nature and SocietyHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Dierential EquationsInternational Journal of
Volume 2018
Hindawiwwwhindawicom Volume 2018
Decision SciencesAdvances in
Hindawiwwwhindawicom Volume 2018
AnalysisInternational Journal of
Hindawiwwwhindawicom Volume 2018
Stochastic AnalysisInternational Journal of
Submit your manuscripts atwwwhindawicom
Mathematical Problems in Engineering 9
Data Availability
The data this paper used is downloaded from the website ofProsper httpswwwprospercominvestdownloadaspx
Conflicts of Interest
The authors declare that there are no conflicts of interestregarding the publication of this paperrdquo
Acknowledgments
The research is supported by the National Natural ScienceFoundation of China (Grants nos 71471027 71731003 and71873103) the National Social Science Foundation of China(Grant no 16BTJ017) National Natural Science Foundationof China Youth Project (Grant no 71601041) LiaoningEconomic and Social Development Key Issues (Grant no2015lslktzdian-05) and Liaoning Provincial Social SciencePlanning Fund Project (Grant no L16BJY016) The authorsacknowledge the organizations mentioned above
References
[1] H Markowitz ldquoPortfolio selectionrdquoe Journal of Finance vol7 no 1 pp 77ndash91 1952
[2] H M Markowitz Portfolio Selection Efficient Diversication ofInvestment Wiley New York NY USA 1959
[3] N Larsen H Mausser and S Uryasev ldquoAlgorithms for opti-mization ofValue-atRiskrdquo in Financial Engineering ECommerceand Supply Chain Applied Optimization P M Pardalos andV K Tsitsiringos Eds vol 70 Kluwer Academic PublishersDordrecht 2002
[4] R T Rockafellar and S Uryasev ldquoConditional value-at-risk forgeneral loss distributionsrdquo Journal of Bankingamp Finance vol 26no 7 pp 1443ndash1471 2002
[5] Y H Guo W J Zhou C Y Luo C R Liu and H XiongldquoInstance-based credit risk assessment for investment decisionsin P2P Lendingrdquo European Journal of Operational Research vol249 no 2 pp 417ndash426 2016
[6] S C P Yam H Yang and F L Yuen ldquoOptimal asset allocationRisk and information uncertaintyrdquo European Journal of Opera-tional Research vol 251 no 2 pp 554ndash561 2016
[7] R Emekter Y Tu B Jirasakuldech and M Lu ldquoEvaluatingcredit risk and loan performance in online Peer-to-Peer (P2P)lendingrdquo Applied Economics vol 47 no 1 pp 54ndash70 2014
[8] E Berkovich ldquoSearch and herding effects in peer-to-peerlending evidence from prospercomrdquo Annals of Finance vol 7no 3 pp 389ndash405 2011
[9] E I Altman ldquoFinancial ratios discriminant analysis and theprediction of corporate bankruptcyrdquoe Journal of Finance vol23 no 4 pp 589ndash609 1968
[10] S Chatterjee and S Barcun ldquoA nonparametric approach tocredit screeningrdquo Publications of the American Statistical Asso-ciation vol 65 no 329 pp 150ndash154 1970
[11] J C Wigintor ldquoA note on the comparison of logit and discrim-inant models of consumer credit behaviorrdquo Journal of Financialand Quantitative Analysis vol 15 no 3 pp 757ndash770 1980
[12] L Breiman J H Friedman R Olshen and C Stone Classifi-cation and Regression Trees Wadsworth Belmont Calif USA1983
[13] M M So and L C Thomas ldquoModelling the profitability ofcredit cards by Markov decision processesrdquo European Journalof Operational Research vol 212 no 1 pp 123ndash130 2011
[14] G Andreeva J Ansell and J Crook ldquoModelling profitabilityusing survival combination scoresrdquo European Journal of Opera-tional Research vol 183 no 3 pp 1537ndash1549 2007
[15] D West ldquoNeural network credit scoring modelsrdquo Computers ampOperations Research vol 27 pp 1131ndash1152 2000
[16] J J Huang G H Tzeng and C S Ong ldquoTwo-stage geneticprogramming (2SGP) for the credit scoring modelrdquo AppliedMathematics and Computation vol 174 no 2 pp 1039ndash10532006
[17] C L Huang M C Chen and C J Wang ldquoCredit scoring witha data mining approach based on support vector machinesrdquoExpert Systems with Applications vol 33 no 4 pp 847ndash8562007
[18] P Danenas and G Garsva ldquoSelection of support vectormachines based classifiers for credit risk domainrdquo ExpertSystems with Applications vol 42 no 6 pp 3194ndash3204 2015
[19] G Sermpinis S Tsoukas and P Zhang ldquoModelling marketimplied ratings using LASSO variable selection techniquesrdquoJournal of Empirical Finance vol 48 pp 19ndash35 2018
[20] K Natarajan D Pachamanova andM Sim ldquoConstructing riskmeasures from uncertainty setsrdquo Operations Research vol 57no 5 pp 1129ndash1141 2009
[21] L Chen S He and S Zhang ldquoTight bounds for some riskmeasures with applications to robust portfolio selectionrdquoOper-ations Research vol 59 no 4 pp 847ndash865 2011
[22] L G Epstein ldquoA paradox for the ldquosmooth ambiguityrdquorsquo model ofpreferencerdquo Econometrica vol 78 no 6 pp 2085ndash2099 2010
[23] K Natarajan M Sim and J Uichanco ldquoTractable robustexpected utility and risk models for portfolio optimizationrdquoMathematical Finance vol 20 no 4 pp 695ndash731 2010
[24] A B Pac and M C Pınar ldquoRobust portfolio choice with CVaRand VaR under distribution and mean return ambiguityrdquo TOPvol 22 no 3 pp 875ndash891 2014
[25] L P Hansen and T J Sargent ldquoRobust control and modeluncertaintyrdquoe American Economic Review vol 91 no 2 pp60ndash66 2001
[26] G C Calafiore ldquoAmbiguous risk measures and optimal robustportfoliosrdquo Society for Industrial and Applied Mathematics vol18 no 3 pp 853ndash877 2007
[27] D Bertsimas V Gupta and N Kallus ldquoData-driven robustoptimizationrdquo Mathematical Programming vol 167 no 2 pp235ndash292 2018
[28] Z Kang X Li Z Li and S Zhu ldquoData-driven robust mean-CVaR portfolio selection under distribution ambiguityrdquo Quan-titative Finance pp 1ndash17 2018
[29] Q Li and J S Racine Nonparametric Econometrics eory andPractice Princeton University Press 2007
[30] O Scaillet ldquoNonparametric estimation and sensitivity analysisof expected shortfallrdquo Mathematical Finance vol 14 no 1 pp115ndash129 2004
[31] H Yao Z Li and Y Lai ldquoMeanndashCVaR portfolio selection Anonparametric estimation frameworkrdquo Computers amp Opera-tions Research vol 40 no 4 pp 1014ndash1022 2013
[32] E A Nadaraja ldquoOn non-parametric estimates of density func-tions and regressionrdquo eory of Probability amp Its Applicationsvol 10 no 1 pp 186ndash190 1965
10 Mathematical Problems in Engineering
[33] T Chen and T He ldquoHiggs boson discovery with boostedtreesrdquo in Proceedings of the NIPS 2014Workshop on High-energyPhysics and Machine Learning pp 69ndash80 2015
[34] Y Xia C Liu Y Li and N Liu ldquoA boosted decision treeapproach using Bayesian hyper-parameter optimization forcredit scoringrdquo Expert Systems with Applications vol 78 pp225ndash241 2017
[35] H He W Zhang and S Zhang ldquoA novel ensemble method forcredit scoring Adaption of different imbalance ratiosrdquo ExpertSystems with Applications vol 98 pp 105ndash117 2018
[36] C-C Yeh F Lin and C-Y Hsu ldquoA hybrid KMV modelrandom forests and rough set theory approach for credit ratingrdquoKnowledge-Based Systems vol 33 no 3 pp 166ndash172 2012
[37] SOreski DOreski andGOreski ldquoHybrid systemwith geneticalgorithm and artificial neural networks and its application toretail credit risk assessmentrdquo Expert Systems with Applicationsvol 39 no 16 pp 12605ndash12617 2012
[38] V Kozeny ldquoGenetic algorithms for credit scoring Alternativefitness function performance comparisonrdquo Expert Systems withApplications vol 42 no 6 pp 2998ndash3004 2015