arXiv:2105.00336v1 [cs.LG] 1 May 2021

Noname manuscript No. (will be inserted by the editor)

Comprehensive Review On Twin Support Vector Machines

M. Tanveer · T. Rajani · R. Rastogi · Y.H. Shao

Received: January 2021 / Accepted:

Abstract Twin support vector machine (TSVM) and twin support vector regression (TSVR) are newly emerging efficient machine learning techniques which offer promising solutions for classification and regression challenges, respectively. TSVM is based upon the idea of identifying two nonparallel hyperplanes which classify the data points to their respective classes. It requires solving two small-sized quadratic programming problems (QPPs) in lieu of a single large-sized QPP in the support vector machine (SVM), while TSVR is formulated along the lines of TSVM and requires solving two SVM-kind problems. Although there has been good research progress on these techniques, there is limited literature on the comparison of different variants of TSVR. Thus, this review presents a rigorous analysis of recent research in TSVM and TSVR simultaneously, mentioning their limitations and advantages. To begin with, we introduce the basic theory of TSVM, then focus on the various improvements and applications of TSVM, and then introduce TSVR and its various enhancements. Finally, we suggest future research and development prospects.

M. Tanveer, Department of Mathematics, Indian Institute of Technology Indore, Simrol, Indore 453552, India. E-mail: [email protected]

T. Rajani, Department of Mathematics, Indian Institute of Technology Indore, Simrol, Indore 453552, India. E-mail: [email protected]

R. Rastogi, Department of Computer Science, South Asian University, New Delhi, India. E-mail: [email protected]

Y.H. Shao, School of Management, Hainan University, Haikou 570228, China. E-mail: [email protected]

Keywords Machine learning · Twin support vector machines · Classification · Regression · Clustering

1 Introduction

SVM [31] is a prominent classification technique widely used since its inception. It was first introduced by Cortes and Vapnik [31] in 1995 for binary classification. SVM seeks decision hyperplanes that determine the decision boundary which classifies the data points into two classes. These decision planes are called support hyperplanes, and the distance between them is maximized by solving a quadratic programming problem (QPP). SVM is computationally powerful even in non-linearly separable cases through the kernel trick [32]. SVM has remarkable advantages as it utilizes the structural risk minimization (SRM) principle, which provides better generalization and reduces error in the training phase. As a result of its superior performance even in non-linear classification problems, it has been implemented in a diverse spectrum of research fields, ranging from text classification, face recognition, financial applications, brain-computer interfaces and bio-medicine to human action recognition [1, 12, 54, 93, 99, 100, 148, 188, 193, 204, 205]. Although SVM has outperformed most other systems, it still has many limitations in dealing with complex data due to the high computational cost of solving QPPs, and its performance depends highly upon the choice of kernel functions and their parameters. Many improvements have been made in the last decade to enhance the accuracy of SVM [85, 86]. One such critical enhancement, proposed in 2006, is the generalized eigenvalue proximal SVM (GEPSVM) [90], which laid the foundation of the twin SVM (TSVM) proposed later by Jayadeva et al. [60, 62]. The idea behind GEPSVM is to determine two nonparallel hyperplanes by solving two generalized eigenvalue problems. In 2007, Jayadeva et al. [60, 62] formulated the twin support vector machine (TSVM) to enhance the generalization ability of GEPSVM. TSVM also determines two nonparallel hyperplanes, but needs to solve a pair of small QPPs in lieu of one complex QPP in SVM. The computational cost of TSVM is approximately one-fourth of that of SVM, and TSVM is faster and has better generalization performance than SVM and GEPSVM. The authors also extended TSVM to nonlinear classification problems by introducing kernels. However, in nonlinear cases when a kernel is used, the formulation requires computing matrix inverses, and its performance depends highly upon the selection of kernel functions and their parameters. Although TSVM is still in its rudimentary stage, many improvements and variants have been proposed by researchers due to its favorable performance, especially in handling large datasets, which is not feasible with conventional SVM. Huang et al. [59] reviewed the research progress of TSVM for classification problems until 2017.

Support vector machines have also been used effectively for regression, where the method is called support vector regression (SVR) [10]. SVR differs from SVM in some aspects: it sets a margin of tolerance ε and finds the optimal hyperplane such that the margin is maximized while the error is simultaneously minimized. Thus, it finds a function such that any error within an ε deviation is acceptable, and the function should be as flat as possible. The learning speed of SVR is low, as it needs to solve a large-sized QPP. Researchers have proposed many variants of SVR to improve its performance in terms of reducing computational complexity and improving accuracy [39, 170]. Some other variants of SVR include ε-support vector regression (ε-SVR) [162], fuzzy SVM [29], ν-SVR [16], robust and sparse SVR [180] and some other algorithms [40, 236]. ε-SVR has high computational cost, and in order to remediate this, Peng developed an efficient twin support vector regression (TSVR) [109] which determines two nonparallel hyperplanes that cluster the data points into two classes by optimizing the regression function. TSVR uses ε-insensitive up- and down-bound functions to optimize the final regression function. The TSVR formulation is computationally less complex, as it needs to solve a pair of two smaller QPPs in lieu of one large QPP in SVR, so its speed is much higher than SVR's. However, the spirit of TSVM was missing in TSVR; thus, as an improvement over Peng's TSVR, Khemchandani et al. [64] proposed an improved TSVR (TWSVR) which works on similar lines to TSVM. To further boost the performance of TSVR, many algorithms have been introduced by researchers; these are discussed in Section 6.
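
As a concrete illustration (our own minimal sketch, not code from any of the cited papers), the ε-insensitive loss underlying SVR and the TSVR bound functions can be written directly; residuals inside the ε-tube incur no penalty:

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # Residuals inside the eps-tube cost nothing; outside it,
    # the penalty grows linearly with the excess deviation.
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

# Residuals of 0, 0.05 and 0.5 with eps = 0.1:
print(eps_insensitive_loss(np.array([1.0, 1.05, 2.0]),
                           np.array([1.0, 1.00, 1.5])))  # -> [0. 0. 0.4]
```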

In the past years, twin support vector classification and regression algorithms have developed rapidly and have been implemented to solve some real-life challenges. However, there is limited literature on twin support vector regression, as it is a relatively new theory and needs further study and improvement. Thus, the objective of this paper is to present a compendium of recent developments in TSVM and TSVR, identify the limitations and advantages of these techniques, and promote future developments.

The framework of this paper is as follows: Section 2 presents a brief overview of TSVM; Sections 3 and 4 cover recent advancements and applications of TSVM, respectively; Section 5 presents a brief overview of TSVR; Sections 6 and 7 cover recent advancements and applications of TSVR, respectively; and finally, Section 8 provides a synopsis and future research and development prospects.

2 Twin Support Vector Machines

Suppose a dataset X ∈ R^n in which l_1 data points are in class +1 (termed the positive class) and l_2 data points are in class −1 (termed the negative class); these are represented by matrices A and B, respectively. Accordingly, these matrices are of size (l_1 × n) and (l_2 × n), where n is the feature space dimension. To categorize data samples which cannot be separated by linear functions, TSVM uses kernel functions to map the data into a higher-dimensional feature space. The two kernel-generated surfaces are given as below:

$$K(x^T, D^T)u_+ + b_+ = 0 \quad \text{and} \quad K(x^T, D^T)u_- + b_- = 0, \qquad (1)$$

where $D = [A; B]$, $u_+, u_- \in R^{l_1 + l_2}$, and $K$ is a kernel function. The formulation for nonlinear TSVM classification is defined as below:

$$\min_{u_+,\, b_+,\, \xi_1} \; \frac{1}{2}\left\|K(A, D^T)u_+ + e_1 b_+\right\|^2 + c_1 e_2^T \xi_1 \quad \text{s.t.} \;\; -(K(B, D^T)u_+ + e_2 b_+) + \xi_1 \geq e_2, \;\; \xi_1 \geq 0, \qquad (2)$$

and

$$\min_{u_-,\, b_-,\, \xi_2} \; \frac{1}{2}\left\|K(B, D^T)u_- + e_2 b_-\right\|^2 + c_2 e_1^T \xi_2 \quad \text{s.t.} \;\; K(A, D^T)u_- + e_1 b_- + \xi_2 \geq e_1, \;\; \xi_2 \geq 0, \qquad (3)$$

where $c_1, c_2 \geq 0$, $e_1, e_2$ are vectors of ones of suitable dimensions, and $\xi_1, \xi_2$ are called slack variables. Using the Lagrange multipliers $\alpha \geq 0$, $\beta \geq 0$ and the Karush-Kuhn-Tucker (KKT) [76] conditions, the duals of the QPPs in (2) and (3) are defined as below:

$$\max_{\alpha} \; e_2^T\alpha - \frac{1}{2}\alpha^T Q(P^TP)^{-1}Q^T\alpha \quad \text{s.t.} \;\; 0 \leq \alpha \leq c_1, \qquad (4)$$

and

$$\max_{\beta} \; e_1^T\beta - \frac{1}{2}\beta^T P(Q^TQ)^{-1}P^T\beta \quad \text{s.t.} \;\; 0 \leq \beta \leq c_2, \qquad (5)$$

where $P = [K(A, D^T) \;\; e_1]$ and $Q = [K(B, D^T) \;\; e_2]$. The final solutions of (4) and (5) are given as follows:

$$\begin{bmatrix} u_+ \\ b_+ \end{bmatrix} = -(P^TP + \delta I)^{-1}Q^T\alpha, \qquad (6)$$

$$\begin{bmatrix} u_- \\ b_- \end{bmatrix} = (Q^TQ + \delta I)^{-1}P^T\beta, \qquad (7)$$

where $\delta I$ is a regularization term with $\delta > 0$.

A new sample $x \in R^n$ is allocated to a class on the basis of the proximity of the kernel-generated surfaces to $x$, i.e.,

$$\mathrm{class}(i) = \mathrm{sign}\left(\frac{K(x^T, D^T)u_+ + b_+}{\|u_+\|} + \frac{K(x^T, D^T)u_- + b_-}{\|u_-\|}\right), \qquad (8)$$

where sign(·) is the signum function.
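
The pipeline of (2)-(8) can be sketched compactly for the linear case. The following toy implementation is our own illustration, not code from the cited papers: a naive projected-gradient loop stands in for a proper QP solver (the step size and iteration count are arbitrary assumptions), the duals (4)-(5) are solved with the δI-regularized inverses of (6)-(7), and classification uses decision rule (8).

```python
import numpy as np

def solve_box_qp(H, f, c, steps=5000, lr=0.01):
    # Maximize f^T a - (1/2) a^T H a over the box 0 <= a <= c
    # via projected gradient ascent (toy stand-in for a QP solver).
    a = np.zeros_like(f)
    for _ in range(steps):
        a += lr * (f - H @ a)          # ascent step on the dual objective
        np.clip(a, 0.0, c, out=a)      # project back onto the box
    return a

def train_tsvm(A, B, c1=1.0, c2=1.0, delta=1e-4):
    # Linear case: P = [A e1], Q = [B e2], as in the text below (4)-(5).
    P = np.hstack([A, np.ones((len(A), 1))])
    Q = np.hstack([B, np.ones((len(B), 1))])
    I = np.eye(P.shape[1])
    invP = np.linalg.inv(P.T @ P + delta * I)   # delta*I as in (6)
    invQ = np.linalg.inv(Q.T @ Q + delta * I)   # delta*I as in (7)
    alpha = solve_box_qp(Q @ invP @ Q.T, np.ones(len(Q)), c1)  # dual (4)
    beta  = solve_box_qp(P @ invQ @ P.T, np.ones(len(P)), c2)  # dual (5)
    z_pos = -invP @ Q.T @ alpha     # [u_+; b_+] from (6)
    z_neg =  invQ @ P.T @ beta      # [u_-; b_-] from (7)
    return z_pos, z_neg

def predict(X, z_pos, z_neg):
    Xe = np.hstack([X, np.ones((len(X), 1))])
    # Decision rule (8): sign of the sum of normalized signed distances.
    s = (Xe @ z_pos / np.linalg.norm(z_pos[:-1])
         + Xe @ z_neg / np.linalg.norm(z_neg[:-1]))
    return np.sign(s)

# Toy usage: two Gaussian blobs.
rng = np.random.default_rng(0)
A = rng.normal([+2, +2], 0.5, size=(40, 2))   # class +1 samples
B = rng.normal([-2, -2], 0.5, size=(40, 2))   # class -1 samples
z_pos, z_neg = train_tsvm(A, B)
print(predict(np.array([[2.0, 1.5], [-1.5, -2.5]]), z_pos, z_neg))  # ~[1. -1.]
```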

3 Research progress on twin support vector machines

3.1 Least Squares Twin Support Vector Machines

To reduce TSVM training time, Kumar and Gopal [78] formulated the least squares TSVM (LS-TSVM) algorithm. The major advantage of LS-TSVM over TSVM is that it deals only with linear equations in place of the QPPs in TSVM, which reduces the computational complexity of the model. LS-TSVM has comparable accuracy but lower computational time than TSVM. Although the computational time of LS-TSVM is less than that of TSVM, its generalization performance is poor, as the same penalty is allotted to every sample whether positive or negative. LS-TSVM applies the empirical risk minimization (ERM) principle, which affects accuracy and also causes over-fitting; it also does not take into account the effects of samples having different locations. To avoid this limitation, Xu et al. [224] formulated weighted LS-TSVM (WLS-TSVM), which gives different weights to the samples depending upon their locations. Experimental results have shown that WLS-TSVM yields better testing accuracy than TSVM and LS-TSVM, but its computational time is higher than that of these algorithms.
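
To see why LS-TSVM reduces to linear equations: with equality constraints and squared slacks, each plane has a closed-form solution. A minimal linear-case sketch follows (our own notation and derivation of the standard LS-TSVM solution; `E` and `F` are the class matrices augmented with a ones column):

```python
import numpy as np

def lstsvm_planes(A, B, c1=1.0, c2=1.0):
    # E and F augment each class matrix with a bias column of ones.
    E = np.hstack([A, np.ones((len(A), 1))])
    F = np.hstack([B, np.ones((len(B), 1))])
    e1, e2 = np.ones(len(A)), np.ones(len(B))
    # Each plane is the solution of one regularized linear system,
    # with no QPP: this is the source of LS-TSVM's speed advantage.
    z_pos = -np.linalg.solve(F.T @ F + (1.0 / c1) * (E.T @ E), F.T @ e2)
    z_neg =  np.linalg.solve(E.T @ E + (1.0 / c2) * (F.T @ F), E.T @ e1)
    return z_pos, z_neg   # each is [w; b] for one class hyperplane
```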

To incorporate expert knowledge in the LS-TSVM classifier, Kumar et al. [79] formulated a knowledge-based LS-TSVM (KBLS-TSVM) which includes polyhedral knowledge sets in the formulation of LS-TSVM. Experimental results have shown that KBLS-TSVM is simple and a more apt classifier compared to LS-TSVM. In order to improve the accuracy of LS-TSVM, Wang et al. [206] incorporated the manifold geometric structure of the data of each class; their method requires solving a set of linear equations.

Although LS-TSVM performs well on large datasets compared to TSVM, Gao et al. [45] designed 1-norm LS-TSVM (NLS-TSVM) to automatically select the relevant features in order to strengthen the algorithm in dealing with high-dimensional datasets. NLS-TSVM is based on LS-TSVM and includes a Tikhonov regularization term. To outdo LS-TSVM in accuracy and avoid the over-fitting problem, Xu et al. [230] formulated improved LS-TSVM (ILS-TSVM), which improves the classification accuracy by implementing the structural risk minimization principle. ILS-TSVM is fast and yields generalization and computational time comparable to LS-TSVM.

To take advantage of the correlation between some data points and reduce the effect of noise, Ye et al. [243] used sample density to allot weights to each sample (DWLSC). Experimental results demonstrate better classification accuracy of DWLSC than TSVM and LS-TSVM.

There are several real-life problems which are essentially multicategory problems. The extension of TSVM to the multicategory setting is an active research domain, with several papers addressing it. However, the class imbalance problem is common in multicategory classification, as all binary SVMs are trained with all patterns. To deal with these issues, Nasiri et al. [97] formulated energy-based LS-TSVM (ELS-TSVM), in which the constraints of LS-TSVM are converted to an energy model to decrease the impact of noise and outliers. TSVM, LS-TSVM, and ELS-TSVM follow the empirical risk minimization (ERM) principle, and in these techniques the matrices are only positive semi-definite. To remediate these problems, Tanveer et al. [179] embodied the SRM principle in ELS-TSVM and proposed robust ELS-TSVM (RELS-TSVM), which makes the matrices positive definite. Moreover, RELS-TSVM introduces an energy parameter to handle noise, making the algorithm more robust. Results have shown the promising generalization performance of RELS-TSVM with less computational time compared with the baseline algorithms. In the recent comprehensive evaluation [178] of 187 classifiers on 90 datasets, RELS-TSVM [179] emerged as the best classifier. Khemchandani and Sharma [73] proposed robust least squares TSVM (RLS-TSVM), which is fast, yields better generalization performance, and is robust to noise; the paper also explores an application in the field of human activity recognition. The authors in [73] further proposed incremental RLS-TSVM to increase the training speed of RLS-TSVM, which also deals with noise and outliers effectively.

Tomar and Agarwal [197] proposed a feature-selection-based LS-TSVM to diagnose heart disease, using the F-score to choose the most relevant features, which enhances the accuracy of the model. To handle imbalanced datasets with better accuracy, Tomar et al. [203] proposed a novel weighted LS-TSVM for imbalanced data which has better accuracy and geometric mean than SVM and TSVM. To further enhance accuracy on imbalanced datasets, Tomar and Agarwal [200] proposed a hybrid feature selection based WLS-TSVM (HFS-based WLS-TSVM) approach for diagnosing diseases like breast cancer, diabetes and hepatitis, which can tackle data imbalance issues. Xu et al. [225] included data distribution information in the classifier (SLS-TSVM). It is based on structural TSVM: it performs clustering before classification and embodies the structural information in the model. SLS-TSVM has better generalization performance than ν-SVM [18], ν-TSVM [107], STSVM [124] and LS-TSVM [78]. It also improves the noise insensitivity of LS-TSVM.

Mei and Xu [92] proposed a novel multi-task LS-TSVM algorithm which is built on directed multi-task TSVM (DMTSVM) and LS-TSVM. It focuses on multi-task learning instead of the single-task learning commonly applied in TSVM and LS-TSVM. This algorithm is computationally effective, as it only requires solving linear equations in lieu of the QPPs in DMTSVM.

Peng [106] proposed the least squares twin support vector hypersphere (LSTSVH), which is an enhancement of the twin support vector hypersphere (TSVH) [119]. TSVH differs from TSVM in that it obtains two hyperspheres by solving two small SVM-type problems. Experimental results demonstrate that LSTSVH has almost the same accuracy as SVM, LS-SVM, TSVM, LS-TSVM, and TSVH, with computational time higher than LS-SVM and LS-TSVM but lower than SVM and TSVM. In 2020, Tanveer et al. [187] proposed efficient large-scale least squares twin support vector machines (LS-LSTSVM), which use different Lagrangian functions to eliminate the need for calculating inverse matrices and thus enhance the performance of TSVM on large-scale datasets.

3.2 Projection Twin Support Vector Machine

Projection TSVM (P-TSVM) [27] is a multiple-surface classification (MSC) technique based on the multi-weight vector projection SVM (MVSVM) and TSVM. In 2011, Chen et al. [27] introduced it with the principle of minimizing within-class variance. Unlike TSVM, which finds hyperplanes, P-TSVM finds projection axes to distinctly separate the samples. To enhance the performance of P-TSVM, the authors proposed a recursive algorithm in which multiple projection axes are considered for each class. P-TSVM is a better classifier in the sense that hyperplanes are better fitted to single-plane-based samples, while for datasets having complex sample distributions, such as XOR problems, projection directions can yield better accuracy. Experimental results have shown that P-TSVM has better accuracy than GEPSVM, TSVM, LS-TSVM, and MVSVM. However, it is computationally more complex than TSVM. Thus, Shao et al. [156] proposed an improved P-TSVM which uses the ideas in LS-SVM [172] and LS-TSVM [79] and adds a regularization term to P-TSVM while keeping the optimization problem positive definite. It is simple and fast, with similar classification accuracy but less computational time than P-TSVM.

Although P-TSVM is an efficient classifier, it only implements the empirical risk minimization principle, which can incur possible singularity problems that need to be dealt with via principal component analysis (PCA) and linear discriminant analysis (LDA). Shao et al. [159] formulated a new P-TSVM variant, called RP-TSVM, by introducing a maximum-margin regularization term, replacing the empirical risk with regularized risk. The authors further proposed a successive over-relaxation (SOR) technique and a genetic algorithm (GA) for parameter selection to optimally solve the primal problems. RP-TSVM not only has better accuracy but also better generalization than P-TSVM.

To apply least squares P-TSVM to non-linear problems, Ding and Hua [34] introduced kernels into LSP-TSVM [156]; like LSP-TSVM, and unlike P-TSVM, it deals only with linear equations. The authors introduced a nonlinear recursive algorithm in order to improve its performance. Experimental results have shown better classification accuracy than LSP-TSVM and P-TSVM.

Guo et al. [48] proposed a feature selection approach for LSP-TSVM which finds two projection directions and has prediction accuracy comparable to that of TSVM, LS-TSVM and LSP-TSVM, and a generalization ability similar to TSVM and LS-TSVM.

LSP-TSVM enhances the performance of P-TSVM but fails to include underlying data information to enhance classification accuracy. To overcome this limitation, Hua and Ding [55] proposed weighted LSP-TSVM (LIWLSP-TSVM), which exploits local information in the algorithm. LIWLSP-TSVM is more effective than LSP-TSVM as it uses a weighted mean rather than the standard mean in LSP-TSVM. It has better generalization ability, but its computational cost is high when solving multi-class classification problems.

Hua et al. [56] formulated a novel projection TSVM (NP-TSVM). NP-TSVM has many advantages over P-TSVM: it is faster than P-TSVM, as the calculation of inverse matrices is avoided in NP-TSVM, and it determines the projection axes so that the margin between the sample and the class is greater than or equal to zero. Experimental results have shown that it obtains better generalization and accuracy when compared with other baseline algorithms. In 2020, Chen et al. [24] formulated an efficient nonparallel sparse projection SVM (NPrSVM), which finds an optimal projection for each class such that the projection values of within-class instances are clustered as much as possible within an insensitive tube while those of the other class are kept away. Extensive experiments showed that NPrSVM is superior to P-TSVM in terms of generalization performance, training time, sparsity and robustness to outliers.

3.3 Twin Parametric Margin Support Vector Machine

Motivated by the ideas of TSVM and Par-ν-SVM [50], in 2011 Peng et al. [111] formulated the twin parametric-margin SVM (TPMSVM). Like Par-ν-SVM, TPMSVM seeks to determine two parametric-margin hyperplanes which define the positive and negative margins, respectively. It is an indirect classifier suitable even when the noise is heteroscedastic, as it automatically adjusts a flexible margin. Experimental results have shown that TPMSVM has generalization ability comparable to SVM, Par-ν-SVM, and TSVM. Also, TPMSVM has much lower training time than Par-ν-SVM, as it solves two small-sized QPPs like TSVM. However, the TPMSVM classifier is not sparse, due to the formulation of its parametric-margin hyperplanes, and although it has good generalization, it is computationally complex as it solves two QPPs. In 2013, Wang et al. [214] added a quadratic function to TPMSVM in the primal space to obtain better training speed; the proposed algorithm is called smooth twin parametric-margin SVM (STPMSVM). The authors also proposed the use of a genetic algorithm (GA) in STPMSVM to overcome the drawback of tuning at least four parameters in TPMSVM.

Shao et al. [158] in 2013 proposed LSTPMSVM to decrease the training cost of TPMSVM, as LSTPMSVM solves two primal problems rather than dual problems. This change makes it less complex and increases training speed; it has comparable or better classification accuracy with remarkably less computational time than TPMSVM. For model selection, the authors proposed a particle swarm optimizer (PSO) which effectively optimizes the four parameters defined in LSTPMSVM. In terms of generalization, LSTPMSVM performs better than TPMSVM and LS-TSVM. In order to consider prior structural data information, Peng et al. [115] in 2013 proposed the structural twin parametric-margin SVM (STPMSVM). Based on cluster granularity, the class data structures are included in the problem. STPMSVM not only obtained good generalization ability but also showed fast learning speed and better performance than TPMSVM. TPMSVM is an efficient classifier, but it loses sparsity due to the weight vectors of the hyperplanes. To solve this issue, Peng et al. in 2015 [114] introduced centroid-based TPSVM (CTPSVM), which uses the projection of the centroid points and leads to a sparse optimal hyperplane by optimizing the centroid projection values.

To incorporate the structural information present in data, in 2017 Rastogi et al. [135] proposed robust parametric TSVM for pattern classification (RP-TSVM), which seeks two parametric-margin hyperplanes capable of adjusting the margin to capture heteroscedastic noise information. Rastogi et al. [72] formulated the angle-based TPSVM (ATPSVM), which can efficiently handle heteroscedastic noise; taking motivation from TPMSVM, ATPSVM determines a pair of hyperplanes such that the angle between their normals is maximized. Other improvements which maximize the angle between the twin hyperplanes are proposed by Rastogi et al. in [133]. Recently, Richhariya and Tanveer proposed a novel angle-based universum LS-TSVM (AULSTSVM) for pattern classification. In contrast to ATPSVM [133], AULSTSVM minimizes the angle between the universum hyperplane and the classifying hyperplane. Numerical experiments show promising generalization performance with very low computational time, and an application to the diagnosis of Alzheimer's disease is explored.

AULSTSVM incorporates prior information about the data distribution through the universum by including a linear loss [152] in its optimization problem. Moreover, the solution of AULSTSVM involves only systems of linear equations, leading to low computation time [78].

3.4 Robust and Sparse Twin Support Vector Machine

TSVM achieves better accuracy and is faster compared to conventional SVM, but it loses sparsity because it considers the L2 norm of distances in the objective function. Thus, in order to make TSVM sparse, Peng [110] in 2011 proposed a TSVM in the primal space which provides a sparse hyperplane with better generalization ability; to improve robustness, a regularization term is also added. It has generalization comparable to TSVM and LS-TSVM and a rapid learning speed, but its computational cost is high. In 2013, Peng and Xu [118] proposed the robust minimum class variance TSVM (RMCV-TSVM) classifier to enhance the generalization and robustness of TSVM. This algorithm has an extra regularization term which makes its learning speed comparable to TSVM while obtaining better generalization ability than TSVM.

Qi et al. [123] in 2013 proposed robust TSVM using a second-order cone programming formulation, which is effective with noise-corrupted data. This algorithm overcomes TSVM's need to compute matrix inverses, which is unsuitable for large datasets: it uses only inner products between samples, so the kernel trick can be applied directly and no extra matrix inverses need to be solved. It is superior in computational time and accuracy to other TSVMs. In 2014, Tian et al. [194] proposed the sparse nonparallel support vector machine (SN-SVM). While TSVM loses sparseness, SN-SVM has inherent sparseness, as it uses two loss functions instead of the one in existing TSVM. Also, whereas only the empirical risk is considered in TSVM, SN-SVM introduces a regularization term by maximizing the margin. TSVM minimizes a loss function based on the l1- or l2-norm; thus, in 2014, Zhang et al. [253] proposed lp-norm LS-TSVM, which is adaptive since p can be chosen automatically by the data.

In order to intensify the robustness and sparsity of the original formulation of TSVM, Tanveer [176] in 2015 incorporated a regularization technique and formulated a linear programming 1-norm TSVM (NLP-TSVM), which needs to solve linear equations rather than the QPPs in TSVM, making it a fast, robust, sparse and simple algorithm. Experimental results demonstrate that NLP-TSVM's generalization ability is better, and its computational time less, than GEPSVM, SVM, and TSVM. Tanveer [173, 174] proposed smoothing approaches for 1-norm linear programming TSVM (SLPTSVM). The solution of SLPTSVM reduces to solving two systems of linear equations, which makes the algorithm extremely simple and computationally efficient. Tanveer et al. [192] proposed sparse pinball TSVM (SP-TSVM) by introducing the ε-insensitive zone pinball loss function into the original TSVM formulation. SP-TSVM is noise-insensitive, retains sparsity, and is more stable under re-sampling.

3.5 Pinball Loss Twin Support Vector Machine

In 2016, Xu et al. [231] proposed a novel TPMSVM with the pinball loss (Pin-TPMSVM). The authors introduced the pinball loss function, which is based on maximizing quantile distances between the two classes, instead of the hinge loss function, which maximizes the distance between the closest samples of the two classes. This leads to noise insensitivity as well as re-sampling stability. Pin-TPMSVM has excellent capability to handle noise, which makes the model a more robust classifier, but little attention is given to sparsity, due to which the testing time is high and there is instability under re-sampling. Xu et al. [232] also formulated maximum margin and minimum volume hyperspheres with pinball loss (Pin-M3HM). The authors proposed this algorithm to enhance the generalization of the twin hypersphere SVM (TH-SVM) in the presence of noise in datasets. The algorithm classifies samples of two classes by a pair of hyperspheres, each containing either the majority or the minority class samples, while maintaining a maximum margin between the hyperspheres. Pin-M3HM is fast and has better accuracy than TSVM. In 2018, Sharma and Rastogi [163] proposed two models, called ε-Pin-TSVM and Flex-Pin-TSVM. ε-Pin-TSVM introduces an ε parameter to reduce the impact of noise and to attain a sparse solution, while Flex-Pin-TSVM uses a self-optimized framework which makes the algorithm flexible enough to estimate the size of the insensitive zone, adapting to the structure of the data. Pin-TPMSVM has an impressive ability to handle noise, but to reduce parameter tuning time, Yang et al. [238] in 2018 introduced a new solution approach for TPMSVM in which one instance is considered at a time and is solved analytically without solving an optimization problem. It is fast, simple and flexible.

For imbalanced data classification, in 2018 Xu et al. [229] proposed a maximum margin twin-spheres machine which uses the pinball loss. This algorithm finds homocentric spheres such that the smaller one captures the positive class samples and the larger one repels the negative samples. It requires solving one QPP and one linear programming (LP) problem, and has good generalization performance and noise insensitivity. In 2019, Sharma et al. [165] proposed a stochastic quasi-Newton method based TPSVM using the pinball loss function (SQN-PTWSVM), which can scale the training process to millions of data points while simultaneously dealing with noise and re-sampling issues.

The aforementioned pinball loss TSVM algorithms are based on TPMSVM rather than the original TSVM. In 2019, Tanveer et al. [185] introduced the pinball loss into the original TSVM, termed general TSVM with pinball loss (Pin-GTSVM). Pin-GTSVM [185] is less sensitive to outliers and more stable under re-sampling compared to the original TSVM. To retain the sparsity of the original TSVM in the pinball setting, Tanveer et al. [192] proposed a novel sparse pinball TSVM (SP-TSVM) which uses the ε-insensitive zone pinball loss function. SP-TSVM has better classification accuracy than TSVM and sparse pin-SVM; it is insensitive to outliers, retains sparseness, and is suitable for re-sampling. Recently, Tanveer et al. [183] proposed the improved sparse pinball TSVM (ISPTSVM) by adding a regularization term to the objective function of SP-TSVM. ISPTSVM implements the SRM principle, and the matrices appearing in the dual formulation are positive definite, which makes the algorithm computationally less complex; it also achieves better accuracy than other baseline algorithms.
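
For reference, the losses discussed in this section can be stated in a few lines. The sketch below is our own illustration (conventions for the ε-insensitive variant differ slightly across the cited papers); it evaluates the pinball loss on u = 1 − y·f(x) and one common ε-insensitive form:

```python
import numpy as np

def pinball_loss(u, tau=0.5):
    # u = 1 - y*f(x): u > 0 is a margin violation, charged in full;
    # u < 0 (correct side) is still charged at rate tau, which ties
    # the solution to quantile distances and damps feature noise.
    return np.where(u >= 0, u, -tau * u)

def eps_insensitive_pinball_loss(u, tau=0.5, eps=0.1):
    # One common sparse variant: zero inside an asymmetric eps-zone,
    # linear outside; the flat zone restores sparsity (cf. SP-TSVM).
    return np.maximum(np.maximum(u - eps, -tau * u - eps), 0.0)
```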

3.6 Twin Support Vector Machine for Universum Data and Imbalanced Datasets

In 2012, Qi et al. [122] formulated TSVM for universum data classification (U-TSVM). Universum data are defined as data samples which do not belong to any given class. This algorithm utilizes the universum data to enhance TSVM classification accuracy. U-TSVM assigns a new data point to either class on the basis of its proximity to the two hyperplanes. Experimental results have shown that U-TSVM has better accuracy than TSVM. In order to construct a robust classifier by including the prior information embedded in the universum samples, in 2014 Qi et al. [125] formulated a nonparallel SVM (U-NSVM) which maximizes the two margins related to the two nearest adjacent classes.

Many real-life problems involve datasets which are imbalanced in nature, i.e., the classes do not contain the same number of data samples, due to which many machine learning algorithms cannot be applied directly. Thus, to enhance the performance of TSVM when dealing with imbalanced datasets, in 2014 Shao et al. [153] proposed weighted Lagrangian TSVM for imbalanced data classification. The authors introduced a quadratic loss function into the formulation of TSVM, which enables faster training; the method is also more robust to outliers and has better computational speed and accuracy than TSVM. In 2014, Tomar et al. [203] proposed weighted LS-TSVM for imbalanced datasets by adjusting the classifier and assigning distinct weights to training samples. The experimental results show that its accuracy is higher than SVM, TSVM, and LS-TSVM. Furthermore, in 2015, Xu et al. [219] formulated a universum least squares TSVM (ULS-TSVM) which exploits the universum data. It introduces a regularization term, thus implementing the SRM principle, and is computationally less complex as it requires solving linear equations in place of the QPPs in TSVM. Cao and Shen [13] proposed combining re-sampling with TSVM for imbalanced data classification. In this algorithm, the authors presented a hybrid re-sampling technique which utilizes the one-side selection (OSS) algorithm and the synthetic minority oversampling technique (SMOTE) to balance the training data, implemented in combination with TSVM for classification. However, SMOTE, being based on K-nearest neighbors, can often result in over-fitting.

Encouraged by the performance of ν-TSVM [107] and U-TSVM [122], Xu et al. [220] in 2016 proposed ν-TSVM for universum data classification (Uν-TSVM). It is more flexible in using the prior knowledge from universum data to improve generalization ability, and uses two hinge loss functions so that the data can remain in a nonparallel insensitive loss tube. Experimental results have shown that Uν-TSVM has better accuracy and lower running time than other baseline algorithms.

In 2017, Xu [218] proposed the maximum margin twin spheres SVM algorithm (MMTSSVM) to overcome many limitations of TSVM when dealing with imbalanced data. This algorithm seeks to determine two homocentric spheres such that the smaller one captures as many positive class samples as possible while the larger one repels negative samples, simultaneously increasing the margin between the spheres.

In 2018, Richhariya et al. [138] added a regularization term to the optimization problem of universum TSVM to make the matrices non-singular and improve the generalization performance. The proposed algorithm, called improved universum TSVM, has better generalization and training time than USVM and UTSVM. In 2019, Richhariya and Tanveer [141] proposed a fuzzy universum SVM (FUSVM) based on information entropy. Universum support vector machines are more efficient than other SVM methods as they include prior information about samples, but they are not effective on noise-corrupted datasets. FUSVM introduces weights such that outlier universum points receive smaller weights. Further, the authors proposed a fuzzy universum TSVM (FUTSVM). Both algorithms have better generalization performance than SVM, USVM, TSVM, and UTSVM. Recently, Richhariya and Tanveer [143] proposed a reduced universum twin support vector machine for class imbalance learning (RUTSVM-CIL), which uses prior information about the data distribution. RUTSVM-CIL uses a reduced kernel matrix and is thus applicable to large-sized imbalanced datasets; numerical experiments on several imbalanced benchmark datasets showed its applicability. In 2020, Richhariya and Tanveer [144] also proposed a novel parametric model for universum-based twin support vector machines and extended its application to the classification of Alzheimer's disease data.

3.7 Fuzzy Twin Support Vector Machines

In 2017, Chen and Wu [20] proposed a novel fuzzy TSVM (NFTSVM) which addresses the problem of one class playing the major role in classification by assigning fuzzy memberships to different samples. The proposed NFTSVM incorporates fuzzy neural networks and provides more generalized results. In 2019, Richhariya and Tanveer [141] proposed a fuzzy universum SVM (FUSVM) based on information entropy; this algorithm is also discussed in Section 3.6. Richhariya and Tanveer [140] also proposed a robust fuzzy LS-TSVM (RFLS-TSVM-CIL) to boost the performance of TSVM on imbalanced datasets. The optimization problem in this algorithm is strictly convex as it uses the 2-norm of the slack variables. The authors further proposed a fuzzy membership function to deal with imbalanced problems, as it provides weights to samples and incorporates the data imbalance ratio. RFLS-TSVM-CIL obtains better generalization and is computationally more efficient, as it requires solving two linear equations. However, RFLS-TSVM-CIL implements the empirical risk minimization principle and requires the matrices to be inverted to be positive semi-definite. To improve on this, Ganaie et al. [44] proposed adding a regularization term to the primal formulation of RFLS-TSVM-CIL. The proposed algorithm, regularized robust fuzzy least squares TSVM (RRFLSTSVM), doesn't require the extra positive semi-definiteness assumption and performs better than RFLS-TSVM-CIL.
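
As a generic illustration of the fuzzy weighting idea running through this subsection (a classic centroid-distance membership in the style of fuzzy SVMs, not the exact function of any single paper above):

```python
import numpy as np

def fuzzy_membership(X_class, delta=1e-3):
    # Centroid-distance membership: points far from their class
    # centroid get weights near delta, so suspected outliers barely
    # influence the hyperplane fit.
    center = X_class.mean(axis=0)
    d = np.linalg.norm(X_class - center, axis=1)
    return 1.0 - d / (d.max() + delta)
```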

In 2018, Khemchandani et al. [70] formulated a fuzzy LS version of TSVC in which each data point has a fuzzy membership value and is allotted to different clusters; this algorithm is also discussed in Section 3.11. Rezvani and Wang [137] formulated the intuitionistic fuzzy TSVM (IFTSVM), a modified version of fuzzy TSVM which considers the position of input data in the feature space, calculates adequate fuzzy memberships, and reduces the effect of noise. Zhang and Li [250] proposed a fuzzy TSVM which assigns fuzzy membership based on the intra-class hyperplane and sample affinity; experimental results showed better accuracy than TSVM and classic fuzzy TSVM. In 2019, Gupta et al. [49] proposed the entropy-based fuzzy twin support vector machine (EFTSVM-CIL), an efficient algorithm for imbalanced datasets which assigns fuzzy membership values based on the entropy of the samples. In 2020, Chen et al. [19] proposed the entropy-based fuzzy least squares twin support vector machine (EFLSTSVM), an improved version of EFTSVM-CIL which formulates the QPPs in the least squares sense and retains the superior characteristics of LS-TSVM. Experimental results showed that this algorithm performed better than other baseline fuzzy TSVM algorithms.

3.8 Some other improvements of Twin Support Vector Machines

Kumar and Gopal [77] in 2008 enhanced TSVM using smoothing techniques. The authors proposed transforming the primal problems of TSVM into smooth unconstrained minimization problems and used the Newton-Armijo algorithm for optimization. Smooth TSVM (STSVM) has generalization comparable to TSVM but is significantly faster. Jayadeva et al. [65] in 2009 proposed an iterative alternating algorithm to make kernels learn efficiently. In 2009, Zhang [251] proposed a boosting-based TSVM for clustered microcalcification detection; results have shown that this method improves detection accuracy. The bagging algorithm was combined with boosting to solve the instability problem of TSVM, and this method is called BB-TSVM.

In 2010, Shao et al. [160] proposed a bi-level programming method for multiple instance classification, called MI-TSVM. Multiple instance learning (MIL) is a supervised technique wherein the training data consist of labeled bags, each containing multiple unlabeled instances. For binary classification, a bag is a negative sample if all the instances in it are negative, and it is labeled a positive sample if at least one instance is positive. Similar to TSVM, the proposed algorithm seeks two hyperplanes such that the positive hyperplane is closest to the positive instances while as distant as possible from the negative instances, and vice-versa. In 2010, Ghorai et al. [46] proposed a unity norm TSVM classifier (UNTSVM) which includes unity norm equality constraints and a quadratic loss function in the objective function of TSVM; this algorithm can be solved by the sequential quadratic optimization method. UNTSVM has higher computational cost than TSVM, especially for large datasets, because of its nonlinear formulation.

In order to improve TSVM's generalization ability and decrease the number of support vectors (SVs), in 2010 Peng [107] introduced a variable p and a parameter ν into TSVM and proposed ν-TSVM. The parameter ν controls the trade-off between the SVs and margin errors, while the adaptive p overcomes the limitation of over-restricted constraints in TSVM and thus reduces the number of support vectors. However, it gives the same penalty to every misclassified sample, which results in over-fitting. Thus, Xu et al. [228] in 2012 introduced rough set theory into ν-TSVM to remediate this limitation. This algorithm is more effective in avoiding over-fitting, as it gives different penalties to different points depending upon their location, and it also has better generalization than ν-TSVM.

Although rough ν-TSVM [228] performs better than ν-TSVM, it gives different weights to negative samples while giving the same weight to all positive samples. Thus, Xu et al. [233] formulated K-nearest neighbor (KNN) weighted rough ν-TSVM, which gives different penalties to samples of both classes. It has better accuracy and lower computational cost than ν-TSVM and rough ν-TSVM. In another attempt to enhance the results of ν-TSVM, Khemchandani et al. [71] formulated two models for binary classification: IνTSVM and IνTSVM Fast. Both algorithms are faster than ν-TSVM, as IνTSVM solves one small QPP and an unconstrained minimization problem (UMP) while IνTSVM Fast solves one unimodal function and one UMP. To overcome the disadvantage of TSVM being sensitive to outliers, Xie and Sun [216] employed class centroids and proposed multitask centroid TSVM for multitask problems.

For handling large-scale datasets, Shao et al. [152] introduced a weighted linear loss into TSVM and proposed weighted linear loss TSVM (WL-TSVM) to classify large-scale datasets. The proposed algorithm deals only with linear equations, which increases training speed, and it also has better generalization than TSVM. To further speed up the training process, Sharma et al. [164] recently proposed a stochastic conjugate gradient method based twin support vector machine (SCH-TSVM). The resulting model showed effective performance on a binary activity classification problem.

Shao et al. [161] in 2011 introduced a regularization term into TSVM to embody the SRM principle and formulated a new algorithm called twin bounded support vector machines (TBSVM); to further speed up training, the SOR technique is used. Numerical results on various datasets have shown that TBSVM is faster and has better generalization ability than TSVM. In 2010, Ye et al. [245] formulated localized TSVM via convex minimization (LC-TSVM), which effectively constructs two nearest-neighbor graphs in the original input space to reduce the space complexity of TSVM. Shao and Deng [155] in 2012 considered unity norm constraints and added a regularization term to minimize the structural risk in TSVM, proposing margin-based TSVM with unity norm hyperplanes (UNHMTSVM). The proposed algorithm is fast and has better generalization and accuracy compared to TSVM.

To include statistical data information in TSVM, Peng and Xu [116] formulated the twin Mahalanobis distance-based SVM (TMSVM), which uses the covariance of each class to determine the hyperplanes. It has better training speed and generalization than TSVM. Shao et al. [157] in 2012 introduced a probability-based TSVM model (P-TSVM) whose accuracy is higher than TSVM. In 2012, Shao and Deng [154] formulated a coordinate-descent-based TSVM to increase the efficiency of TSVM by introducing a regularization term. Further, to solve the dual problems, the authors proposed a coordinate descent method which reduces training time even for large datasets, as it deals with a single data point at a time.

To include the underlying correlation information between data points in TSVM, in 2012 Ye et al. [244] formulated a novel nonparallel plane classifier, called weighted TSVM with local information (WL-TSVM). A major limitation of this algorithm is that it does not work for large-scale problems, as it has to find the K-nearest neighbors of all the samples. Based on TSVM, in 2013 Peng and Xu [119] formulated a twin-hypersphere SVM (TH-SVM). Unlike TSVM, it determines two hyperspheres, and it avoids the matrix inversions in its dual formulations.

To incorporate structural data information in TSVM, Qi et al. [124] formulated structural TSVM, which is superior in training time as well as accuracy to TSVM. However, it still ignores the importance of different samples within each cluster. To overcome this drawback, in 2015 Pan et al. [101] proposed a K-nearest neighbor based structural TSVM, which provides different penalties to the samples of different classes and also increases testing accuracy.

In 2014, Shao et al. [150] formulated a different nonparallel hyperplane SVM in which the hyperplanes are determined by clustering the samples based on the similarity between the two classes. This algorithm has better or comparable accuracy with low computational time. To optimally select the parameters in TSVM, in 2016 Ding et al. [36] proposed a TSVM built on the fruit fly optimization algorithm (FOA-TSVM), which can optimally select parameters in less time and with better accuracy. In 2017, Pan et al. [102] proposed safe screening rules to make TSVM efficient for large-scale classification problems. The safe screening rules reduce the problem scale by eliminating training samples while yielding the same solution as the original problem; experimental results showed that they can greatly reduce the computational time. Yang and Xu [240] proposed a safe sample screening rule (SSSR) for the Laplacian twin parametric-margin SVM (LTPSVM) to address the handling of large-scale problems. In 2018, Pang and Xu [104] proposed a safe screening rule for weighted TSVM with local information (WLTSVM) to make the algorithm applicable to large-scale datasets. Experimental results demonstrate the effectiveness of SSR for WLTSVM, as it performed better than SVM and TSVM with significantly less computational time. Recently, Zhao et al. [254] proposed an efficient nonparallel hyperplane universum support vector machine (U-NHSVM) for classification problems; U-NHSVM is flexible in exploiting the prior information in the universum.

Tanveer [174] proposed an unconstrained minimization problem (UMP) formulation of linear programming TSVM to enhance robustness and sparsity in TSVM. Tanveer [175] also proposed an implicit Lagrangian TSVM which is solved using the finite Newton method. Tanveer and Shubham [190] in 2017 proposed a smooth TSVM via UMP, which increases the generalization ability and training speed of TSVM.

Multi-label learning deals with data having multiple labels and has gained a lot of attention recently. Chen et al. [23] in 2016 proposed multi-label TSVM (MLTSVM), which exploits multi-label information from instances. In 2020, Azad-Manjiri et al. [4] proposed a structural twin support vector machine for multi-label learning (ML-STSVM), which embeds the prior structural information of the data into the optimization function of MLTSVM based on the same clustering technology as S-TSVM. This algorithm achieved better performance compared to other baseline multi-label learning algorithms.

In 2018, Rastogi et al. [131] proposed a new loss function termed the (ε1, ε2)-insensitive zone pinball loss function, which generalizes other existing loss functions, e.g., the pinball loss and hinge loss. The resulting model takes care of noise insensitivity, instability under re-sampling, and the scatteredness present in the datasets. In 2019, Tanveer et al. [178] presented a rigorous comparison of 187 classifiers, including 8 variants of TSVM, with exhaustive evaluation performed on 90 UCI benchmark datasets. Results have shown that RELS-TSVM achieved the highest performance of all classifiers for the binary classification task; thus, RELS-TSVM is the best TSVM classifier. For more details, one can refer to [178]. In 2020, Ganaie and Tanveer [42] proposed a novel classification approach using a pre-trained functional link to enhance the feature space; the authors performed the classification task with LSTSVM on the enhanced features and validated the performance on various datasets. Other recent research on TSVM includes [43], where the authors proposed a novel way of generating oblique decision trees. The classification of training samples is done based on the Bhattacharyya distance with a randomly selected feature subset, and then the hyperplanes are generated using TBSVM. The major advantage of the proposed model is that no extra regularization is needed, as the matrices are positive definite.

3.9 Twin Support Vector Machine for Multi-class Classification

Earlier, TSVM was only implemented for binary classification; however, the majority of problems in real-world applications are multicategory in nature. Thus, Xu et al. [222] formulated Twin-KSVC, which implements the "1-versus-1-versus-rest" structure to provide ternary outputs (−1, 0, 1). This algorithm requires solving two smaller-sized QPPs. It has better accuracy than "1-versus-rest" TSVM but loses sparsity. Yang et al. [242] in 2013 proposed multiple birth SVM (MBSVM), which is computationally better than TSVM even when the number of classes is large.

In 2013, Xie et al. [215] also extended TSVM to multi-class problems and proposed one-versus-all TSVM (OVA-TSVM), which solves a k-category problem using the one-versus-all (OVA) approach to build k TSVMs. To strengthen the performance of multi-class TSVM, Shao et al. [151] formulated a separating decision tree TSVM (DTTSVM). The basic idea of DTTSVM is to use the best separating principle to create a binary tree and then build a binary TSVM model at each node. Experimental results have shown that DTTSVM has low computational complexity and better generalization.
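
The one-versus-all construction used by OVA-TSVM is easy to sketch generically; in the following illustration (ours, not code from [215]), `fit_binary` is a placeholder for any binary twin classifier from this survey, returning a scoring function whose larger values indicate the focus class:

```python
import numpy as np

def ova_train(X, y, fit_binary):
    # fit_binary(A, B) -> scorer(X): larger scores mean "belongs to
    # A's class". Any binary twin classifier could be plugged in.
    models = {}
    for k in np.unique(y):
        A, B = X[y == k], X[y != k]   # class k versus the rest
        models[k] = fit_binary(A, B)
    return models

def ova_predict(X, models):
    classes = sorted(models)
    scores = np.column_stack([models[k](X) for k in classes])
    return np.array(classes)[scores.argmax(axis=1)]  # most confident wins
```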

Xu and Guo [221] in 2014 formulated the twin hypersphere multi-class SVM (THKSVM), which employs the "rest-versus-1" structure instead of the "1-versus-rest" structure in TSVM. In order to find the hyperspheres, it constructs k classifiers (for k classes) and finds k centers and k radii, one per hypersphere. Each hypersphere covers the maximum number of points of one class while remaining as distant as possible from the rest of the classes. THKSVM has fast computational speed compared to Twin-KSVC, and no inverse operations of matrices are required when solving its dual QPPs; thus, it performs better on large datasets and has better accuracy than "1-versus-rest" TSVM, but lower than Twin-KSVC. To enhance the performance of Twin-KSVC, in 2015 Nasiri et al. [96] formulated an LS version of Twin-KSVC that works similarly to Twin-KSVC but only requires solving linear equations rather than the pair of QPPs in Twin-KSVC. The proposed algorithm is fast and simple, with better accuracy and lower training time than Twin-KSVC. In another approach to enhance the performance of Twin-KSVC and include local information of the samples, in 2016 Xu [217] proposed a K-nearest neighbor-based weighted multi-class TSVM which uses within-class information and applies a weight in the objective function. This algorithm has lower computational cost and better accuracy than Twin-KSVC.

Based on the multi-class extension of the binary LS-TSVM, in 2015 Tomar and Agarwal [199] proposed a weighted multi-class LSTSVM algorithm for multi-class imbalanced data. To control the sensitivity of the classifier, weights are employed in the loss function for determining each hyperplane. Experimental results have shown the superiority and feasibility of the proposed algorithm for multi-class imbalanced problems. The authors [198] also extended LS-TSVM to multi-class classification and compared various multi-class classifier structures such as "one-versus-all", "one-versus-one", "all-versus-one" and "directed acyclic graph (DAG)". DAG MLS-TSVM shows superior performance and high computational efficiency.

In 2016, Yang et al. [241] formulated the least squares recursive projection TSVM. For each class, it determines k groups of projection axes and only needs to solve linear equations; it has similar performance to MP-TSVM. Based on P-TSVM, Li et al. [84] in 2016 proposed the multiple recursive projection TSVM (Multi-P-TSVM), which solves k QPPs in order to determine k projection axes (for k classes). The authors introduced a regularization term and a recursive procedure which increase the generalization, but the algorithm becomes complex when more orthogonal projection axes are generated.

In 2017, Ding et al. [37] reviewed various multi-class algorithms as per their structures: "one-versus-rest", "one-versus-one", "binary tree", "one-versus-one-versus-rest", "all-versus-one" and "directed acyclic graph" based multi-class TSVMs. All of these multi-class TSVMs have advantages and disadvantages; in general, one-versus-one TSVMs have higher performance. Ai et al. [2] in 2018 proposed a multi-class classification weighted least squares TSVH using local density information in order to improve the performance of LSTSVH; local density information is introduced into LSTSVH to provide a weight for each data point and thereby avoid noise sensitivity. In 2018, Pang et al. [103] proposed a K-nearest neighbor-based weighted multi-class TSVM (KMTSVM) that incorporates the "1-versus-1-versus-rest" strategy for multi-class classification and also takes into account the distribution of all instances. However, it is computationally expensive, especially for large-scale problems; thus, the authors in [103] also proposed a safe instance reduction rule (SIR-KMTSVM) to reduce its computational time. Lima et al. [88] proposed improvements on the least squares twin multi-class classification SVM, which follows the "one-versus-one-versus-rest" strategy and generates ternary outputs. The proposed algorithm only needs to deal with linear equations, and numerical results demonstrate that it achieves better classification accuracy than Twin-KSVC [222] and LSTKSVC [96]. Qiang et al. [126] also proposed an improvement on LSTKSVC, the robust weighted linear loss twin multi-class support vector regression (WLT-KSVC), which addresses two drawbacks of LSTKSVC: sensitivity to outliers and misclassification of some rest-class samples due to the use of a quadratic loss. Experiments on the UCI and NDC datasets showed promising results for this algorithm, but its training accuracy significantly decreases as the number of classes increases. Li et al. [86] proposed a nonparallel support vector machine (NSVM) for the multiclass classification problem; numerical experiments on several benchmark datasets clearly show the advantage of NSVM. In 2020, Tanveer et al. [186] proposed a fast and improved version of KWMTSVM [227], called least squares K-nearest neighbor weighted multi-class TSVM (LS-KWMTSVM). Numerical experiments on various KEEL imbalanced datasets showed high accuracy and low computational time for the proposed LS-KWMTSVM compared to other baseline algorithms.

3.10 Twin Support Vector Machine for Semi-Supervised Learning

Semi-supervised learning (SSL) techniques have received extensive attention from many researchers in the last few years due to their promising applications in machine learning and data mining. In many real-world challenges, labeled data is not easily available, which deteriorates the performance of supervised learning algorithms due to insufficient supervised information. SSL overcomes this limitation by using both unlabeled and labeled data.

In 2012, Qi et al. [121] formulated a novel Laplacian TSVM (Lap-TSVM) for the semi-supervised classification problem. This algorithm assumes that the data points lie in a low-dimensional manifold and uses the geometric information of the unlabeled data points. Similar to TSVM, it solves a pair of QPPs with inverse matrix operations. Results have shown that Lap-TSVM has better flexibility, prediction accuracy and generalization performance than conventional TSVM. The algorithm performs excellently for semi-supervised classification problems, but solving QPPs with the required inverse matrix operations makes its computational cost high. Thus, to increase the training speed of Lap-TSVM, Chen et al. [21] in 2014 formulated an LS version of Lap-TSVM (Lap-LS-TSVM). Unlike Lap-TSVM, it only needs to deal with linear equations, solved using the conjugate gradient algorithm, and it has less training time than Lap-TSVM. Chen et al. [22] in 2014 also proposed to improve the performance of Lap-TSVM by transforming the QPPs into unconstrained minimization problems (UMPs) in the primal space; a smoothing method is applied so that they can be solved effectively by the Newton-Armijo algorithm. It achieved comparable accuracy to Lap-TSVM with less computational time. In 2016, Khemchandani and Pal [67] extended this to multi-class classification and formulated a multi-category Lap-LS-TSVM which evaluates training samples in a "1-versus-1-versus-rest" fashion and provides ternary outputs (−1, 0, +1). It has better accuracy and less training time than Lap-TSVM. A sketch of the graph Laplacian underlying these manifold-regularized models follows.
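As a concrete reference, the following is a minimal dense sketch of the k-nearest-neighbor graph Laplacian L = D − W on which such manifold-regularized models are built. Binary 0/1 edge weights are assumed for simplicity (heat-kernel weights are also common), and the data points are assumed distinct so that each point's nearest neighbor is itself.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Build the graph Laplacian L = D - W of a symmetrized kNN graph.

    X : (m, n) array of (labeled and unlabeled) training points.
    Returns the (m, m) unnormalized Laplacian used as a manifold regularizer.
    """
    m = len(X)
    # pairwise squared Euclidean distances (dense; fine for modest m)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]   # skip column 0 (the point itself)
    W = np.zeros((m, m))
    rows = np.repeat(np.arange(m), k)
    W[rows, nn.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # symmetrize the adjacency
    return np.diag(W.sum(axis=1)) - W         # L = D - W
```

The quadratic form f^T L f then penalizes decision functions that vary sharply between neighboring points, which is the geometric prior these semi-supervised TSVMs exploit.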


Based on a similar idea to Lap-TSVM, in 2016, Yang and Xu [239] proposed an extension of the traditional TPMSVM which makes use of the graph Laplacian and creates a connection graph of the training data points; its solution can be obtained by solving two SVM-type QPPs. Experiments reveal that LTPMSVM has higher classification accuracy than SVM, TPMSVM, Lap-TSVM, and TSVM.

Khemchandani and Pal [69], in 2017, blended the Laplacian TSVM and Decision Tree TSVM classifiers and formulated a tree-based classifier for semi-supervised multi-class classification. Extensive experiments on color images show the feasibility of the model. Another interesting approach in this direction was suggested in 2019 by Rastogi and Sharma [134], called Fast Laplacian TSVM for Active Learning (FLap-TWSVM-AL), where the authors proposed to identify the most informative and representative training points whose labels are queried from domain experts for annotation. Once the corresponding labels are acquired, this limited labeled data together with the unlabeled data is used to train a fast model that involves solving a QPP and an unconstrained minimization problem to seek the classification hyperplanes.

3.11 Twin Support Vector Machine for Clustering

Wang et al. [212] formulated the twin support vector clustering (TSVC) algorithm built upon TSVM. It divides the data samples into k clusters such that the samples gather around k cluster center planes. It exploits the information from clusters (both between and within clusters), and the center planes are determined by solving a series of QPPs. The authors in [212] also proposed a nearest neighbor graph (NNG)-based initialization to make the model more stable and efficient.

In 2018, Khemchandani et al. [70] formulated a fuzzy least squares version of TSVC in which each data point has a fuzzy membership value and is allotted to different clusters. The proposed algorithm solves primal problems instead of the dual problems of TSVC. Experimental results showed that the proposed algorithm obtains clustering accuracy comparable to that of TSVC but with less training time.

Khemchandani and Pal [68] replaced the hinge loss function of TSVC with a weighted linear loss and introduced a regularization term into TSVC, so that only linear equations need to be solved and the structural risk component is also implemented. The proposed algorithm, called weighted linear loss TSVC, achieves higher accuracy than TSVC. The loss function proposed in [68] is not continuous; therefore, the authors in [130] modified the weighted linear loss function to form a continuous loss function, implemented TSVC and WLL-TSVC in a semi-supervised framework, and also introduced a fuzzy extension of semi-supervised WLL-TSVC to make the algorithm robust and less sensitive to noise. Experimental results have demonstrated that the proposed algorithm achieves better clustering accuracy and less computational time than TSVC and WLL-TSVC. Tree-based localized fuzzy twin support vector clustering (Tree-TSVC) was proposed in [132]. Tree-TSVC is a novel clustering algorithm that builds clusters, each representing a node on a binary tree, where each node comprises a proposed TSVM-based classifier. Due to the tree structure and a formulation that leads to solving a series of systems of linear equations, the Tree-TSVC model is efficient in terms of training time and accuracy.

TSVC uses the squared L2-norm distance, which leads to noise sensitivity. Also, each cluster plane learning iteration requires solving a QPP, which makes it computationally complex. To address these issues in TSVC, Ye et al. [246] in 2018 used the L1-norm distance and proposed an L1-norm distance minimization-based robust TSVC which only deals with linear equations instead of a series of QPPs. The authors in [246] further proposed RTSVC and Fast RTSVC to speed up the computation of TSVC in the nonlinear case. Numerical experiments showed that this model has higher accuracy than other k-plane clustering methods and less computational time than TSVC. Wang et al. [256] proposed a ramp-based TSVC by introducing the ramp cost function into TSVC. The solution of the proposed algorithm is obtained by solving a non-convex programming problem using an alternating iteration algorithm. In 2019, Bai et al. [5] introduced regularization into clustering and proposed a large margin TSVC. The authors in [5] also proposed a fast least squares TSVC with uniform coding output. The algorithm achieves better performance than TSVC and other plane-clustering methods. Wang et al. [213] in 2019 proposed a general model for plane-based clustering which includes various extensions of k-plane clustering (kPC). It optimizes the problem by minimizing a total loss derived from both within-cluster and between-cluster terms, and it can capture the data distribution accurately. In 2020, Richhariya and Tanveer [145] proposed a projection-based least squares TSVC (LSPTSVC) and employed the concave-convex procedure (CCCP) to solve the optimization problem. LSPTSVC only needs to solve a set of linear equations and thus requires significantly less computational time. The alternating assign-and-refit structure shared by these plane-based clustering methods is sketched below.
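To make the shared template concrete, the following is a minimal sketch of a generic kPC-style alternating scheme: assign each point to its nearest cluster plane, then refit each plane to its cluster. It illustrates plane-based clustering in general (with a least-variance plane fit), not TSVC's specific QPP-based plane update.

```python
import numpy as np

def k_plane_clustering(X, k, n_iter=100, seed=0):
    """Alternate between assigning points to the nearest plane
    w_c^T x + b_c = 0 and refitting each plane to its cluster."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    labels = rng.integers(k, size=m)
    W = rng.normal(size=(k, n))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    b = np.zeros(k)
    for _ in range(n_iter):
        for c in range(k):
            Xc = X[labels == c]
            if len(Xc) < 2:
                continue  # keep the previous plane for (near-)empty clusters
            mu = Xc.mean(axis=0)
            # the plane normal is the direction of least variance (last right
            # singular vector), minimizing the sum of squared point-plane distances
            _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
            W[c], b[c] = Vt[-1], -Vt[-1] @ mu
        dist = np.abs(X @ W.T + b) / np.linalg.norm(W, axis=1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stabilized
        labels = new_labels
    return labels, W, b
```

TSVC and its variants replace the plane-refitting step with their respective QPPs or linear systems, which is where the between-cluster information enters.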

4 Applications of Twin Support Vector Classification

TSVM has been implemented to solve many real-life classification challenges and has shown promising results in several applications. As a classifier, TSVM has been applied to many pattern classification problems, and researchers have proposed many improved algorithms and variants for pattern classification [17, 20, 60, 61, 78, 89, 106, 107, 111, 116, 117, 123, 147, 195, 196].

Biomedical applications - TSVM has been applied to detect breast cancer in its early stages by detecting a mass in digital mammograms [167]. A hybrid feature selection based approach has also been implemented for detecting breast cancer, hepatitis, and diabetes [200]. Wang et al. [209] proposed TSVM in combination with the dual-tree complex wavelet transform (DTCWT) method for pathological brain detection. TSVM has also been implemented for detecting cardiac diseases; in one such application, proposed by Houssein et al. [53], heartbeats were detected using Swarm-TSVM, which achieved better accuracy than TSVM. Refahi et al. [136] used LS-TSVM and DAG LS-TSVM classifiers for predicting arrhythmia heart disease. Chandra and Bedi [15] proposed a linear norm fuzzy based TSVM for color based classification of human skin which achieved better accuracy than other conventional methods. Xu et al. [234] proposed a semi-supervised TSVM for detection of acute-on-chronic liver failure (ACLF). Also, many researchers have applied TSVM to detect Alzheimer's disease in its early stages [3, 184, 197, 201, 202, 208–211, 252].

Alzheimer's Disease Prediction - In 2020, Richhariya and Tanveer [142] proposed an angle based universum least squares TSVM (AULSTSVM) which achieved 95% accuracy in detecting Alzheimer's disease (AD). The authors [146] also proposed universum support vector machine based recursive feature elimination (USVM-RFE) for detecting AD. Khan et al. [63] proposed an approach to improve the classification accuracy of mild cognitive impairment (MCI), normal controls (NC), and AD using structural magnetic resonance imaging (sMRI). The authors used FreeSurfer to process the MRI data and derive cortical features, which are used in TSVM, LSTSVM and RELS-TSVM to detect AD.

Speaker Recognition - Cong et al. [30] formulated a multi-class TSVM for speaker recognition with feature extraction based on Gaussian mixture models (GMMs). It gives better results than the traditional SVM. Zhendong and Yang [235] proposed a multi-TSVM which finds a hyperplane for every class and takes constraints from the other classes separately in the QPP. This algorithm performed better than many other algorithms for speaker recognition.

Text Categorization - Kumar and Gopal [78] formulated a least squares TSVM, and experiments have shown the validity of the model for text applications. Francis and Sreenath [41] proposed a manifold regularized TSVM for text recognition, and the proposed method achieved the highest accuracy among SVM, LSTSVM and other methods.

Intrusion Detection - An intrusion detection system monitors and protects a network against malicious activity and has become a critical component of network security. Researchers [38, 51, 94, 98] applied TSVM for intrusion detection, and results have shown that it achieves better accuracy than other intrusion detection algorithms.

Human activity recognition - Khemchandani and Sharma [73] proposed a least squares TSVM for human activity recognition which gives promising results even in the presence of outliers. Nasiri et al. [97] formulated an energy-based LS-TSVM algorithm. Khemchandani and Sharma [74] also proposed a robust parametric twin support vector machine which can effectively deal with noise. Mozafari et al. [95] used the Harris detector algorithm and applied LS-TSVM for action recognition, achieving higher accuracy than other state-of-the-art methods. Kumar and Rajagopal [80] proposed a multi-class TSVM combined with the Constrained Local Model for detecting happiness in human faces. The authors [81] also proposed a semi-supervised multi-TSVM to predict human facial emotions with 13 minimal features that can detect six basic human emotions. The algorithm achieved the highest accuracy and least computation time with minimal feature vectors.

Image Denoising - Yang et al. [237] proposed edge/texture-preserving image denoising based on TSVM, which is very effective at preserving informative features such as edges and textures and outperforms other available image denoising methods. Shahdoosti and Hazavei [149] proposed a ripplet formulation of the total variation method for denoising images. This algorithm attains promising results in improving image quality in terms of both subjective and objective inspection.

Electroencephalogram (EEG) - Classification of electroencephalogram (EEG) signals for different mental activities has been an active research topic. Richhariya and Tanveer [139] proposed a universum TSVM which is insensitive to outliers, as it selects universum data points from the EEG dataset itself, removing the effect of outliers. Soman and Jayadeva [171] used a classifiability metric to choose discriminative frequency bands and used TSVM to learn unbalanced datasets. Tanveer et al. [182] proposed entropy-based features in the flexible analytic wavelet transform (FAWT) framework with RELS-TSVM [179] for classification to detect epileptic seizure EEG. Tanveer et al. [181] used the FAWT framework for classification of seizure and seizure-free EEG signals with Hjorth parameters as features and implemented TSVM, LSTSVM and RELS-TSVM [179] for classifying the signals. Li et al. [87] proposed LSTSVM with a frequency band selection common spatial pattern algorithm for detecting motor imagery electroencephalography. This algorithm achieved faster training time as compared to other SVM baseline algorithms. More researchers have applied TSVM to EEG classification, biometric identification and leak detection challenges [33, 75, 82, 87, 166].

Other Applications - Cao et al. [14] proposed an improved TSVM with multi-objective cuckoo search to predict software bugs. The authors employed TSVM to predict defective modules and used cuckoo search to optimize TSVM, and this method achieved better accuracy than other software defect prediction methods. Chu et al. [28] proposed a multi-information TSVM for detecting steel surface defects.

In Table 1, $g_y(x) = g(x; w_y, b_y) = w_y^T x + b_y = 0$ is the hyperplane associated with class $y$, $D(g_y(A), 0)$ denotes the intra-class distance, which represents the objective function, $D(g_y(A), g_y(B))$ is the inter-class distance, which corresponds to the constraints, and $D(g_y(x))$ is the perpendicular distance of a point $x$ from the hyperplane $g_y(x) = 0$. The pinball loss function is defined as follows:

$$
L_\tau(X_y, y, g_y(X_y)) =
\begin{cases}
e^T\big(0 - y\, g_y(X_y)\big), & \big(0 - y\, g_y(X_y)\big) \geq 0,\\
-\tau\, e^T\big(y\, g_y(X_y) - 0\big), & \big(0 - y\, g_y(X_y)\big) < 0.
\end{cases}
$$
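As a concrete reference, the following is a minimal NumPy sketch of the standard pinball loss acting elementwise on a margin-type residual. Sign and margin conventions vary slightly between the TSVM papers cited above, so this is illustrative rather than a transcription of any one formulation.

```python
import numpy as np

def pinball_loss(u, tau):
    """Standard pinball loss on a margin-type residual u: a linear penalty
    on one side and a tau-scaled linear penalty on the other."""
    u = np.asarray(u, dtype=float)
    return np.where(u >= 0, u, -tau * u)
```

Setting tau = 0 recovers a one-sided hinge-type loss, while tau > 0 also penalizes points on the "correct" side, which is what gives pinball-loss TSVMs their noise insensitivity around the decision boundary.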

The optimization frameworks of various TSVM algorithms are summarized in Table 1 [86].


Table 1 Optimization framework for TSVM algorithms

| Method | Regularization term | $D(g_y(X_y), 0)$ | $D(g_y(X_j), g_y(X_y))$ | $D(g_y(x))$ | Distance |
|---|---|---|---|---|---|
| GEPSVM [91] | $\Vert(w_y; b_y)\Vert^2$ | $\Vert g_y(X_y)\Vert^2$ | $\Vert g_y(X_j)\Vert^2$ | $\lvert g_y(x)\rvert / \Vert w_y\Vert$ | $\Vert\cdot\Vert$ |
| TSVM [60] | $0$ | $\Vert g_y(X_y)\Vert^2$ | $e^T\Vert g_y(X_j) - e\Vert_+$ | $\lvert g_y(x)\rvert$ | $\Vert\cdot\Vert / (\cdot)_+$ |
| TBSVM [161] | $\Vert(w_y; b_y)\Vert^2$ | $\Vert g_y(X_y)\Vert^2$ | $e^T\Vert g_y(X_j) - e\Vert_+$ | $\lvert g_y(x)\rvert / \Vert w_y\Vert$ | $\Vert\cdot\Vert / (\cdot)_+$ |
| LSTSVM [78] | $0$ | $\Vert g_y(X_y)\Vert^2$ | $\Vert g_y(X_j) - e\Vert^2$ | $\lvert g_y(x)\rvert$ | $\Vert\cdot\Vert$ |
| Pin-TSVM [231] | $\Vert w_y\Vert^2$ | $L_\tau(X_y, y, g_y(X_y))$ | $\Vert g_y(X_j)\Vert_1$ | $\lvert g_y(x)\rvert / \Vert w_y\Vert$ | $\Vert\cdot\Vert / L_\tau(\cdot) / \Vert\cdot\Vert_1$ |
| Pin-GTSVM [185] | $\Vert w_y\Vert^2$ | $\Vert g_y(X_y)\Vert^2$ | $L_\tau(X_y, y, g_y(X_y))$ | $\lvert g_y(x)\rvert / \Vert w_y\Vert$ | $\Vert\cdot\Vert / L_\tau(\cdot) / \Vert\cdot\Vert_1$ |
| NPSVM [194] | $\Vert w_y\Vert^2$ | $\Vert g_y(X_y) - \varepsilon\Vert_1$ | $e^T(g_y(X_j) - e)_+$ | $\lvert g_y(x)\rvert$ | $\Vert\cdot\Vert / \Vert\cdot\Vert_1 / (\cdot)_+$ |
| Par-TSVM [111] | $\Vert w_y\Vert^2$ | $e^T g_y(X_j)$ | $\Vert g_y(X_y)\Vert_+$ | $\lvert g_y(x)\rvert$ | $\Vert\cdot\Vert$ |

Table 2 Properties of different TSVM algorithms

Models \ Characteristics: SRM, Sparsity, Matrix Inversion, Insensitive to noise

TSVM [60]: ✓ ✓ ✓
TBSVM [161]: ✓ ✓ ✓ ✓
LSTSVM [78]: ✓
RELS-TSVM [179]: ✓ ✓ ✓
Projection TSVM [56]: ✓ ✓ ✓
TPMSVM [111]: ✓ ✓ ✓
Pin-GTSVM [185]: ✓ ✓
SP-TSVM [192]: ✓ ✓ ✓
Pin-TPMSVM [231]: ✓ ✓
ISPTSVM [183]: ✓ ✓ ✓ ✓

Table 2 shows the differences among major TSVM methods based on the SRM principle, sparsity, matrix inversion and noise insensitivity.


5 Basic theory of Twin Support Vector Regression

Peng [109] in 2010 proposed an efficient regression algorithm in line with TSVM, called twin support vector regression (TSVR). Like TSVM, it requires solving two QPPs. The final regressor is the mean of the ε-insensitive down- and up-bound functions. TSVR has less computational time than standard SVR and better generalization ability. The down- and up-bound functions for the linear case are given below.

For any $x \in \mathbb{R}^n$, the two hyperplanes are defined as follows:

$$f_1(x) = u_1^T x + b_1 \quad \text{and} \quad f_2(x) = u_2^T x + b_2. \tag{9}$$

The two QPPs in the linear case are defined as below:

$$
\begin{aligned}
\min_{(u_1, b_1, \zeta_1) \in \mathbb{R}^{n+1+m}} \quad & \frac{1}{2}\,\Vert y - e\varepsilon_1 - (A u_1 + b_1 e)\Vert^2 + C_1 e^T \zeta_1 \\
\text{s.t.} \quad & y - (A u_1 + b_1 e) \geq e\varepsilon_1 - \zeta_1, \quad \zeta_1 \geq 0
\end{aligned} \tag{10}
$$

and

$$
\begin{aligned}
\min_{(u_2, b_2, \zeta_2) \in \mathbb{R}^{n+1+m}} \quad & \frac{1}{2}\,\Vert y + e\varepsilon_2 - (A u_2 + b_2 e)\Vert^2 + C_2 e^T \zeta_2 \\
\text{s.t.} \quad & (A u_2 + b_2 e) - y \geq e\varepsilon_2 - \zeta_2, \quad \zeta_2 \geq 0,
\end{aligned} \tag{11}
$$

here, $C_1, C_2 > 0$; $\varepsilon_1, \varepsilon_2 > 0$ and $\zeta_1, \zeta_2$ are slack variables. The final regressor is the mean of the up- and down-bound regressors in (9), which is given as follows:

$$f(x) = \frac{1}{2}\,(f_1(x) + f_2(x)) \quad \text{for all } x \in \mathbb{R}^n. \tag{12}$$

For the nonlinear case, kernel-generated surfaces are used rather than hyperplanes, which are given below:

$$f_1(x) = k(x^T, A^T)\, u_1 + b_1 \quad \text{and} \quad f_2(x) = k(x^T, A^T)\, u_2 + b_2. \tag{13}$$

The two QPPs in the nonlinear case are defined as below:

$$
\begin{aligned}
\min_{(u_1, b_1, \zeta_1) \in \mathbb{R}^{m+1+m}} \quad & \frac{1}{2}\,\Vert y - e\varepsilon_1 - (k(A, A^T) u_1 + b_1 e)\Vert^2 + C_1 e^T \zeta_1 \\
\text{s.t.} \quad & y - (k(A, A^T) u_1 + b_1 e) \geq e\varepsilon_1 - \zeta_1, \quad \zeta_1 \geq 0
\end{aligned} \tag{14}
$$

and

$$
\begin{aligned}
\min_{(u_2, b_2, \zeta_2) \in \mathbb{R}^{m+1+m}} \quad & \frac{1}{2}\,\Vert y + e\varepsilon_2 - (k(A, A^T) u_2 + b_2 e)\Vert^2 + C_2 e^T \zeta_2 \\
\text{s.t.} \quad & (k(A, A^T) u_2 + b_2 e) - y \geq e\varepsilon_2 - \zeta_2, \quad \zeta_2 \geq 0.
\end{aligned} \tag{15}
$$

For more details, one can refer to [109].
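A minimal sketch of how the linear problems (10)-(12) can be solved with an off-the-shelf QP solver follows. It uses the cvxopt package, a single C and ε for both bounds for brevity, and a small `ridge` stabilizer on the quadratic term that is not part of the formulation.

```python
import numpy as np
from cvxopt import matrix, solvers

solvers.options['show_progress'] = False

def tsvr_fit(A, y, C=1.0, eps=0.1, ridge=1e-8):
    """Solve the linear TSVR primal problems (10) and (11) with cvxopt.

    A : (m, n) data matrix, y : (m,) targets.
    Returns (u1, b1), (u2, b2) for the down- and up-bound functions in (9).
    """
    m, n = A.shape
    G_aug = np.hstack([A, np.ones((m, 1))])           # maps (u, b) to Au + be

    def solve_bound(t, sign):
        # variables z = [u; b; zeta]; objective 0.5*||t - G_aug w||^2 + C e'zeta
        P = np.zeros((n + 1 + m, n + 1 + m))
        P[:n + 1, :n + 1] = G_aug.T @ G_aug
        P += ridge * np.eye(n + 1 + m)                # numerical stabilizer only
        q = np.concatenate([-G_aug.T @ t, C * np.ones(m)])
        # constraints: sign*(G_aug w - t) <= zeta  and  zeta >= 0
        Gc = np.block([[sign * G_aug, -np.eye(m)],
                       [np.zeros((m, n + 1)), -np.eye(m)]])
        h = np.concatenate([sign * t, np.zeros(m)])
        sol = solvers.qp(matrix(P), matrix(q), matrix(Gc), matrix(h))
        z = np.asarray(sol['x']).ravel()
        return z[:n], z[n]

    u1, b1 = solve_bound(y - eps, +1.0)   # (10): Au1 + b1*e <= y - eps*e + zeta
    u2, b2 = solve_bound(y + eps, -1.0)   # (11): Au2 + b2*e >= y + eps*e - zeta
    return (u1, b1), (u2, b2)

def tsvr_predict(x, bounds):
    """Final regressor (12): the mean of the two bound functions."""
    (u1, b1), (u2, b2) = bounds
    return 0.5 * ((x @ u1 + b1) + (x @ u2 + b2))
```

The nonlinear case (14)-(15) is obtained by replacing A with the kernel matrix $k(A, A^T)$ in `G_aug`, so the same solver routine applies unchanged.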


Although Peng [109] took motivation from the TSVM formulation in proposing TSVR, where the regressor is obtained by solving a pair of quadratic programming problems (QPPs), the authors in [64] later argued that the TSVR formulation is not in the true spirit of TSVM. Further, taking motivation from Bi and Bennett [11], they proposed an alternative TSVR formulation which is in the true spirit of TSVM, and showed that it can be derived from TSVM for an appropriately constructed classification problem.

6 Research progress on Twin Support Vector Regression

6.1 Weighted Twin Support Vector Regression

Xu and Wang [226] introduced a weighted TSVR in which different samples are given different weights and hence exert different influence on the bound functions. Computational results have demonstrated that this algorithm avoids over-fitting and also yields good generalization ability. The authors [227] also proposed a K-nearest neighbor based weighted TSVR (KNNWTSVR) which gives different penalties to different samples based on their local information; the weights are assigned based on the number of k-nearest neighbors. KNNWTSVR has better accuracy but similar computational complexity, as its optimization is similar to TSVR. However, these algorithms implement only empirical risk minimization and suffer from the inversion of positive semi-definite matrices. To overcome these limitations, Tanveer et al. [191] introduced an efficient regularized KNNWTSVR (RKNNWTSVR) which makes the problems strongly convex. RKNNWTSVR leads to better generalization and more robust solutions, and its computational cost is also reduced as it only deals with linear equations. To overcome noise sensitivity in TSVR, Ye et al. [248] introduced a weighting matrix into Lagrangian ε-twin support vector regression. It uses quadratic loss functions and provides different weights to different samples through the weighting matrix. It obtains better generalization and also has less training time than TSVR and ε-TSVR. One plausible KNN-based weighting scheme is sketched below.
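As an illustration of the kind of local-density weighting used in these methods, the following is one plausible KNN-based scheme (a sketch, not the exact weighting of [227]): each sample is weighted by how often it appears among the k nearest neighbors of the other samples, so samples in denser regions receive larger weights.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_occurrence_weights(X, k=5):
    """Weight each sample by how often it occurs among the k nearest
    neighbors of other samples; isolated points (likely outliers) get
    small weights, dense-region points get weights near 1."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)            # idx[:, 0] is the point itself
    counts = np.bincount(idx[:, 1:].ravel(), minlength=len(X))
    return counts / counts.max()           # normalize to [0, 1]
```

Such weights can then scale either the quadratic loss terms or the slack penalties in the two bound-function problems, which is the common pattern across the weighted TSVR variants above.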

6.2 Projection Twin Support Vector Regression

TSVR [109] and the twin parametric insensitive SVR [112] have obtained better generalization performance than classical SVR, but both algorithms only implement empirical risk minimization and do not include any prior information about the data samples, which can lead to noise sensitivity. Thus, Peng et al. [120] proposed an efficient twin projection SVR (TPSVR) algorithm which exploits the prior structural information of the data in the learning process. It seeks two projection axes such that the projected points have as small an empirical variance as possible on the down- and up-bound functions. This algorithm has better generalization and requires a small number of support vectors.


6.3 Robust and Sparse Twin Support Vector Regression

Although TSVR has proven to be an effective regression method with good generalization ability, it is less robust due to the square of the 2-norm in the QPPs of TSVR. In 2012, Zhong et al. [257] improved TSVR by using the 1-norm rather than the 2-norm distance in TSVR's QPPs, obtaining less training time and better generalization. Chen et al. [26] formulated a smooth TSVR (STSVR) using a smoothing function in order to make the QPPs of TSVR positive definite and obtain a unique global solution. The authors converted the QPPs into unconstrained minimization problems (UMPs) and applied the Newton method to solve them effectively. All these algorithms still have high computational time due to solving quadratic or linear programming problems. To avoid this shortcoming, an LS version of TSVR (TLSSVR) was formulated by Zhao et al. [255] which leads to faster computation as it only deals with a set of linear equations. The authors also proposed a sparse TLSSVR.

In 2014, Chen et al. [25] introduced regularization into the formulation of TSVR and implemented an L1-norm loss function to make it robust and sparse simultaneously. Huang et al. [58] formulated a sparse version of least squares TSVR by introducing a regularization term to make it strongly convex and also converted the primal problems into linear programming problems. This leads to a sparse solution with significantly less computational time. In 2017, Tanveer [177] formulated a 1-norm TSVR to improve the robustness and sparsity of the original TSVR; it has better accuracy, better generalization, and less computational time than TSVR. In 2020, Singla et al. [169] proposed a novel robust TSVR (Res-TSVR) which is not sensitive to noise in the data. The optimization problem is non-convex with a smooth L2 regularizer; to solve it efficiently, the authors converted it into a convex optimization problem. Res-TSVR performed best among other robust TSVR algorithms in terms of robustness to noise and generalization. Gu et al. [47] also proposed a TSVR variant suitable for handling noise, called fast clustering-based weighted TSVR (FCWTSVR), which classifies the samples into different categories based on their similarities and provides different weights to samples located at different positions. The proposed algorithm performed better than TSVR, ε-TSVR, KNNWTSVR and WL-ε-TSVR.

6.4 Other improvements on Twin Support Vector Regression

TSVR provides better generalization results, but it needs to solve two QPPs, which increases its learning time. Thus, Peng [108] formulated a primal TSVR (P-TSVR) which only deals with linear equations. This improves the learning speed of TSVR while showing comparable generalization. Further, to increase the sparsity of TSVR, the author introduced a back-fitting strategy for optimizing the unconstrained QPP. TSVR requires one set of constraints for each QPP, which increases the computational time for large datasets. To overcome this disadvantage, Singh et al. [168] introduced rectangular kernels into the formulation of TSVR and proposed a reduced TSVR, which results in significant savings in computational time and thus promotes its application to large datasets. To further reduce computational time, an LS version of TSVR (TLSSVR) was formulated by Zhao et al. [255] which only deals with a set of linear equations; the authors also proposed a sparse TLSSVR.

Huang and Ding [57] further attempted to reduce the computational time by proposing a least squares TSVR in the primal space (PLSTSVR) rather than in the dual space. It also requires only the solution of two systems of linear equations and has accuracy comparable to TSVR. To make TSVR suitable for handling heteroscedastic noise structures, Peng [112] proposed the twin parametric insensitive SVR (TPISVR), which determines a pair of nonparallel parametric-insensitive up- and down-bound functions. It also works effectively when the noise depends on the input value. It requires solving two SVM-type problems, which increases its learning speed. Computational results showed that it also has good generalization ability. Shao et al. [162] implemented the SRM principle in the TSVR primal space and proposed a new regressor, ε-TSVR, which seeks a pair of ε-insensitive proximal functions. Further, to reduce complexity, the successive over-relaxation (SOR) technique is employed. Experimental results show that ε-TSVR has better generalization and faster training speed than TSVR. A minimal SOR iteration of the kind used in such solvers is sketched below.
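For reference, the following is a minimal unconstrained SOR iteration for a symmetric positive definite linear system; the actual ε-TSVR dual additionally projects each coordinate onto its box constraint, which this sketch omits.

```python
import numpy as np

def sor_solve(A, b, omega=1.3, tol=1e-8, max_iter=1000):
    """Successive over-relaxation (SOR) for A x = b, with A symmetric
    positive definite; 0 < omega < 2 guarantees convergence."""
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_prev = x.copy()
        for i in range(len(b)):
            # b_i minus the off-diagonal part of row i (uses already-updated x_j)
            residual = b[i] - A[i] @ x + A[i, i] * x[i]
            x[i] = (1 - omega) * x[i] + omega * residual / A[i, i]
        if np.linalg.norm(x - x_prev, np.inf) < tol:
            break
    return x
```

Because each sweep touches one coordinate at a time, SOR avoids forming or inverting large matrices, which is exactly why it suits the large dual problems arising in ε-TSVR.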

Balasundaram and Tanveer [8] proposed a linearly convergent Lagrangian TSVR (LTSVR) algorithm. Experimental results have exhibited its suitability and applicability due to its better generalization and less computational time than TSVR. Inspired by this algorithm, Balasundaram and Gupta [6] proposed the Lagrangian dual of the 2-norm TSVR. Results demonstrated an increase in learning speed with better accuracy when compared to TSVR. Tanveer et al. [191] introduced a regularization term into the objective function of TSVR and formulated a regularized Lagrangian TSVR (IRLTSVR). This algorithm implements the SRM principle and requires solving linear equations in place of the QPPs in TSVR. The optimization problems are positive definite, avoiding singularity in the solution, and the algorithm has better accuracy and speed than conventional TSVR.

Balasundaram and Tanveer [9] proposed a smooth Newton method for LTSVR which solves linear equations in each iteration using the Newton-Armijo algorithm. It has comparable generalization ability but is at least two times faster than TSVR. Khemchandani et al. [66] proposed a TSVR for the simultaneous learning of a function and its derivatives. This algorithm is more accurate, computationally less complex and more robust, as it uses the L1-norm error.

Peng et al. [113] implemented the use of interval data to handle interval input-output data (ITSVR). Rastogi et al. [127, 128] provided an extension of ν-SVR, i.e., ν-TWSVR, and a large margin distribution machine based regression, both in the true spirit of TSVM. Balasundaram and Meena [7] proposed an unconstrained TSVR formulation in the primal space (UP-TSVR) which is faster and obtains better generalization than TSVR. This model is solved by a gradient based iterative approach.


Parastalooi et al. [105] proposed a modified version of TSVR that includes structural information about the data and its distribution. Clustering is performed based on the mean and covariance matrix of the data, which increases accuracy. Furthermore, to increase the training speed, the authors applied the SOR technique and also optimized parameter selection by implementing the PSO algorithm. Rastogi et al. [129] proposed an improved version of ν-TSVR which can automatically adjust the values of the upper and lower bound parameters to attain better accuracy. Experimental results have shown the superiority of the proposed algorithm over ε-TSVR.

TSVR gives the same weight to all samples, although samples at different positions influence the regressor differently; this is ignored in TSVR. Thus, Xu et al. [223] proposed an asymmetric ν-TSVR based on the pinball loss function. The pinball loss gives different penalties to points lying above and below the bounds. It is insensitive to noise and also has better generalization ability. Tanveer and Shubham [189] added a regularization term to TSVR in the primal form, which yields better accuracy and a more robust solution.

7 Applications of Twin Support Vector Regression

Ye et al. [249] implemented an L1-ε-TSVR for forecasting inflation. This algorithm proved to be excellent for feature ranking and determined the important features for inflation in China. Experimental results showed that its accuracy is better than that of ordinary least squares (OLS) models. Ye et al. [247] also implemented an ε-wavelet TSVR for inflation forecasting. The authors employed the wavelet kernel, which can approximate any curve in quadratic continuous integral space. This algorithm attains a lower root mean squared error (RMSE) and is thus an efficient method for inflation forecasting. Ding et al. [35] predicted stock prices using polynomial smooth twin support vector regression. Numerical experiments reveal that this algorithm can obtain better regression performance compared with SVR and TSVR. Le et al. [83] proposed a novel genetic algorithm (GA) based TSVR to improve the precision of indoor positioning. It performs better than K-nearest neighbor and neural network methods, and is comparable to SVR with significantly less computational time. Houssein [52] proposed a particle swarm optimization (PSO) based TSVR for forecasting wind speed. The computational results showed that the proposed approach achieved better forecasting accuracy and outperformed both the genetic algorithm with TSVR and TSVR with the radial basis kernel function. In 2018, Wang and Xu [207] proposed a safe screening rule (SSR) based on variational inequalities (VI) to make TSVR efficient for large-scale problems, as SSR reduces computational time significantly. The authors also implemented the dual coordinate descent method (DCDM) to further increase the computational speed of TSVR.

Table 3 shows the differences among major TSVR methods based on the SRM principle, sparsity, matrix inversion and noise insensitivity.


Table 3 Properties of different TSVR algorithms

Models \ Characteristics: SRM, Sparsity, Matrix Inversion, Insensitive to noise

TSVR [109]: ✓ ✓
TWSVR [64]: ✓ ✓ ✓
ε-TSVR [162]: ✓ ✓ ✓
WTSVR [226]: ✓ ✓
LTSVR [8]: ✓ ✓
RLTSVR [189]: ✓ ✓ ✓
PTSVR [108]: ✓ ✓ ✓ ✓
KNNWTSVR [227]: ✓ ✓
RKNNWTSVR [191]: ✓ ✓ ✓

8 Future research and development prospects

TSVM classification algorithms have higher training speed than conventional SVM with comparable accuracy, but the field is still at an early stage of development and lacks a broad practical application background. TSVM can suffer from limited generalization ability and also lacks sparsity. Therefore, TSVM needs further research and improvement to be applied effectively to real-life challenges. Future research prospects for TSVM include the following:

– An interesting aspect can be coupling other machine learning algorithms with TSVM. For example, a deep convolutional neural network can extract features which can then be classified using TSVM with better accuracy.

– There is limited research on TSVM applications for large-scale classification. Thus, how to develop TSVM algorithms for big data classification problems effectively is a worthwhile direction.

– For non-linear classification problems, TSVM performance depends highly on the kernel function selection, and there is not enough research to guide researchers in choosing kernels for different applications to achieve the desired accuracy and performance of the TSVM algorithm. Thus, kernel selection and optimal parameter selection need further study and improvement.

– Currently, only a few TSVM algorithms have been implemented for multi-class classification, which leads to class imbalance problems and often loses sparsity. Thus, further study is required on TSVM implementations for multi-class classification.


– The main concept of GEPSVM/TSVM is based on linear discriminant analysis (LDA). A thorough cross-study of TSVM and LDA is worthy of future work.

– TSVM applications to health care are currently limited. Thus, how to implement TSVM effectively for the early diagnosis of human diseases such as Alzheimer's disease, epilepsy and breast cancer is worthy of study.

– Clustering, which aims at dividing the data samples into different clusters, is a fundamental problem related to classification. Clustering based TSVM approaches are less studied currently and need further study and development.

– There is limited research on TSVM applications for remote-sensing. Thus,how to build efficient classifiers for remote-sensing can be explored.

TSVR is much faster than conventional SVR and also has better generalization ability. However, it suffers from a lack of model complexity control, which can result in over-fitting. It also loses sparsity, similar to TSVM, and is sensitive to outliers. Further research on TSVR can focus on the following:

– A significant limitation of TSVR is its high computational time, as it loses sparsity. More work is required to find an efficient sparse TSVR algorithm.

– Data cleaning, transformation and pre-processing are important issues for every machine learning technique and can tremendously improve results, even helping to identify novel interactions within the data. As TSVR is a relatively new technique, various data handling, cleaning and pre-processing techniques, such as missing value imputation, can be explored to improve the performance of TSVR.

– TSVR has been evaluated on only a few types of continuous-variable problems. Its application can be explored on a wide range of problems in different domains.

– Current TSVR formulations require the setting of multiple hyperparameters, and hence optimal parameter selection is an issue. Thus, further research on identifying and choosing parameters is needed.

Acknowledgements This work is supported by Science and Engineering Research Board (SERB) funded research projects, Government of India, under the Early Career Research Award Scheme, Grant No. ECR/2017/000053, and the Ramanujan Fellowship Scheme, Grant No. SB/S2/RJN-001/2016, and by the Council of Scientific & Industrial Research (CSIR), New Delhi, India, under the Extra Mural Research (EMR) Scheme, Grant No. 22(0751)/17/EMR-II. We gratefully acknowledge the Indian Institute of Technology Indore for providing facilities and support. Yuan-Hai Shao acknowledges the financial support of the National Natural Science Foundation of China (11871183, 61866010, 11926349).

Conflict of Interest: The authors declare that they have no conflict of interest.

References

1. Agarwal, S., Tomar, D., Siddhant: Prediction of software defects using twin support vector machine. In: 2014 International Conference on Information Systems and Computer Networks (ISCON), pp. 128–132. IEEE (2014)

2. Ai, Q., Wang, A., Zhang, A., Wang, Y., Sun, H.: A multi-class classification weighted least squares twin support vector hypersphere using local density information. IEEE Access 6, 17284–17291 (2018)
3. Alam, S., Kwon, G.R., Kim, J.I., Park, C.S.: Twin SVM-based classification of Alzheimer disease using complex dual-tree wavelet principal coefficients and LDA. Journal of Healthcare Engineering (2017)
4. Azad-Manjiri, M., Amiri, A., Sedghpour, A.S.: ML-SLSTSVM: A new structural least square twin support vector machine for multi-label learning. Pattern Analysis and Applications 23(1), 295–308 (2020)
5. Bai, L., Shao, Y.H., Wang, Z., Li, C.N.: Clustering by twin support vector machine and least square twin support vector classifier with uniform output coding. Knowledge-Based Systems 163, 227–240 (2019)
6. Balasundaram, S., Gupta, D.: Training Lagrangian twin support vector regression via unconstrained convex minimization. Knowledge-Based Systems 59, 85–96 (2014)
7. Balasundaram, S., Meena, Y.: Training primal twin support vector regression via unconstrained convex minimization. Applied Intelligence 44(4), 931–955 (2016)
8. Balasundaram, S., Tanveer, M.: On Lagrangian twin support vector regression. Neural Computing and Applications 22(1), 257–267 (2013)
9. Balasundaram, S., Tanveer, M.: Smooth Newton method for implicit Lagrangian twin support vector regression. International Journal of Knowledge-based and Intelligent Engineering Systems 17(4), 267–278 (2013)
10. Basak, D., Pal, S., Patranabis, D.C.: Support vector regression. Neural Information Processing-Letters and Reviews 11(10), 203–224 (2007)
11. Bi, J., Bennett, K.P.: A geometric approach to support vector regression. Neurocomputing 55, 79–108 (2003)
12. Byvatov, E., Schneider, G.: Support vector machine applications in bioinformatics. Applied Bioinformatics 2(2), 67–77 (2003)
13. Cao, L., Shen, H.: Combining re-sampling with twin support vector machine for imbalanced data classification. In: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), pp. 325–329. IEEE (2016)
14. Cao, Y., Ding, Z., Xue, F., Rong, X.: An improved twin support vector machine based on multi-objective cuckoo search for software defect prediction. International Journal of Bio-Inspired Computation 11(4), 282–291 (2018)
15. Chandra, M.A., Bedi, S.: A twin support vector machine based approach to classifying human skin. In: 2018 4th International Conference on Computing Communication and Automation (ICCCA), pp. 1–5. IEEE (2018)
16. Chang, C.C., Lin, C.J.: Training ν-support vector regression: theory and algorithms. Neural Computation 14(8), 1959–1977 (2002)


17. Chen, J., Ji, G.: Weighted least squares twin support vector machines for pattern classification. In: Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on, vol. 2, pp. 242–246. IEEE (2010)
18. Chen, P.H., Lin, C.J., Scholkopf, B.: A tutorial on ν-support vector machines. Applied Stochastic Models in Business and Industry 21(2), 111–136 (2005)
19. Chen, S., Cao, J., Chen, F., Liu, B.: Entropy-based fuzzy least squares twin support vector machine for pattern classification. Neural Processing Letters 51(1), 41–66 (2020)
20. Chen, S.G., Wu, X.J.: A new fuzzy twin support vector machine for pattern classification. International Journal of Machine Learning and Cybernetics 9(9), 1553–1564 (2018)
21. Chen, W.J., Shao, Y.H., Deng, N.Y., Feng, Z.L.: Laplacian least squares twin support vector machine for semi-supervised classification. Neurocomputing 145, 465–476 (2014)
22. Chen, W.J., Shao, Y.H., Hong, N.: Laplacian smooth twin support vector machine for semi-supervised classification. International Journal of Machine Learning and Cybernetics 5(3), 459–468 (2014)
23. Chen, W.J., Shao, Y.H., Li, C.N., Deng, N.Y.: MLTSVM: a novel twin support vector machine to multi-label learning. Pattern Recognition 52, 61–74 (2016)
24. Chen, W.J., Shao, Y.H., Li, C.N., Wang, Y.Q., Liu, M.Z., Wang, Z.: NPrSVM: Nonparallel sparse projection support vector machine with efficient algorithm. Applied Soft Computing 90, 106142 (2020)
25. Chen, X., Yang, J., Chen, L.: An improved robust and sparse twin support vector regression via linear programming. Soft Computing 18(12), 2335–2348 (2014)
26. Chen, X., Yang, J., Liang, J., Ye, Q.: Smooth twin support vector regression. Neural Computing and Applications 21(3), 505–513 (2012)
27. Chen, X., Yang, J., Ye, Q., Liang, J.: Recursive projection twin support vector machine via within-class variance minimization. Pattern Recognition 44(10-11), 2643–2655 (2011)
28. Chu, M., Liu, X., Gong, R., Liu, L.: Multi-class classification method using twin support vector machines with multi-information for steel surface defects. Chemometrics and Intelligent Laboratory Systems 176, 108–118 (2018)
29. Chuang, C.C.: Fuzzy weighted support vector regression with a fuzzy partition. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 37(3), 630–640 (2007)
30. Cong, H., Yang, C., Pu, X.: Efficient speaker recognition based on multi-class twin support vector machines and GMMs. In: 2008 IEEE Conference on Robotics, Automation and Mechatronics, pp. 348–352. IEEE (2008)
31. Cortes, C., Vapnik, V.: Support vector networks. Machine Learning 20(3), 273–297 (1995)


32. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press (2000)
33. Dalal, M., Tanveer, M., Pachori, R.B.: Automated identification system for focal EEG signals using fractal dimension of FAWT-based sub-band signals. In: Machine Intelligence and Signal Analysis, pp. 583–596. Springer (2019)
34. Ding, S., Hua, X.: Recursive least squares projection twin support vector machines for nonlinear classification. Neurocomputing 130, 3–9 (2014)
35. Ding, S., Huang, H., Nie, R.: Forecasting method of stock price based on polynomial smooth twin support vector regression. In: International Conference on Intelligent Computing, pp. 96–105. Springer (2013)
36. Ding, S., Zhang, X., Yu, J.: Twin support vector machines based on fruit fly optimization algorithm. International Journal of Machine Learning and Cybernetics 7(2), 193–203 (2016)
37. Ding, S., Zhao, X., Zhang, J., Zhang, X., Xue, Y.: A review on multi-class TWSVM. Artificial Intelligence Review, pp. 1–27 (2017)
38. Ding, X., Zhang, G., Ke, Y., Ma, B., Li, Z.: High efficient intrusion detection methodology with twin support vector machines. In: 2008 International Symposium on Information Science and Engineering, vol. 1, pp. 560–564. IEEE (2008)
39. Drucker, H., Burges, C.J., Kaufman, L., Smola, A.J., Vapnik, V.: Support vector regression machines. In: Advances in Neural Information Processing Systems, pp. 155–161 (1997)
40. Elattar, E.E., Goulermas, J., Wu, Q.H.: Electric load forecasting based on locally weighted support vector regression. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 40(4), 438–447 (2010)
41. Francis, L.M., Sreenath, N.: Robust scene text recognition: Using manifold regularized twin-support vector machine. Journal of King Saud University-Computer and Information Sciences (2019)
42. Ganaie, M., Tanveer, M.: LSTSVM classifier with enhanced features from pre-trained functional link network. Applied Soft Computing, p. 106305 (2020)
43. Ganaie, M., Tanveer, M., Suganthan, P.: Oblique decision tree ensemble via twin bounded SVM. Expert Systems with Applications 143, 113072 (2020)
44. Ganaie, M., Tanveer, M., Suganthan, P.: Regularized robust fuzzy least squares twin support vector machine for class imbalance learning. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)
45. Gao, S., Ye, Q., Ye, N.: 1-norm least squares twin support vector machines. Neurocomputing 74(17), 3590–3597 (2011)
46. Ghorai, S., Hossian, S.J., Mukherjee, A., Dutta, P.K.: Unity norm twin support vector machine classifier. In: 2010 Annual IEEE India Conference (INDICON), pp. 1–4. IEEE (2010)


47. Gu, B., Fang, J., Pan, F., Bai, Z.: Fast clustering-based weighted twin support vector regression. Soft Computing, pp. 1–17 (2020)
48. Guo, J., Yi, P., Wang, R., Ye, Q., Zhao, C.: Feature selection for least squares projection twin support vector machine. Neurocomputing 144, 174–183 (2014)
49. Gupta, D., Richhariya, B., Borah, P.: A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Computing and Applications 31(11), 7153–7164 (2019)
50. Hao, P.Y.: New support vector algorithms with parametric insensitive/margin model. Neural Networks 23(1), 60–73 (2010)
51. He, J., Zheng, S.H.: Intrusion detection model with twin support vector machines. Journal of Shanghai Jiaotong University (Science) 19(4), 448–454 (2014)
52. Houssein, E.H.: Particle swarm optimization-enhanced twin support vector regression for wind speed forecasting. Journal of Intelligent Systems (2017)
53. Houssein, E.H., Ewees, A.A., ElAziz, M.A.: Improving twin support vector machine based on hybrid swarm optimizer for heartbeat classification. Pattern Recognition and Image Analysis 28(2), 243–253 (2018)
54. Hua, S., Sun, Z.: Support vector machine approach for protein subcellular localization prediction. Bioinformatics 17(8), 721–728 (2001)
55. Hua, X., Ding, S.: Weighted least squares projection twin support vector machines with local information. Neurocomputing 160, 228–237 (2015)
56. Hua, X., Xu, S., Gao, J.: A novel projection twin support vector machine for pattern recognition. In: Smart Cities Conference (ISC2), 2017 International, pp. 1–4. IEEE (2017)
57. Huang, H.J., Ding, S.: Primal least squares twin support vector regression. Journal of Zhejiang University SCIENCE C 14(9), 722–732 (2013)
58. Huang, H., Wei, X., Zhou, Y.: A sparse method for least squares twin support vector regression. Neurocomputing 211, 150–158 (2016)
59. Huang, H., Wei, X., Zhou, Y.: Twin support vector machines: A survey. Neurocomputing 300, 34–43 (2018)
60. Jayadeva, Khemchandani, R., Chandra, S.: Twin support vector machines for pattern classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(5), 905–910 (2007)
61. Jayadeva, Khemchandani, R., Chandra, S.: Fuzzy twin support vector machines for pattern classification. In: Mathematical Programming and Game Theory for Decision Making, pp. 131–142. World Scientific (2008)
62. Jayadeva, Khemchandani, R., Chandra, S.: Twin Support Vector Machines: Models, Extensions and Applications, vol. 659. Springer (2016)
63. Khan, R., Tanveer, M., Pachori, R., Alzheimer's Disease Neuroimaging Initiative (ADNI): A novel method for the classification of Alzheimer's disease from normal controls using MRI. Expert Systems, https://doi.org/10.1111/exsy.12566 (2020)
64. Khemchandani, R., Goyal, K., Chandra, S.: TWSVR: regression via twin support vector machine. Neural Networks 74, 14–21 (2016)


65. Khemchandani, R., Jayadeva, Chandra, S.: Optimal kernel selection in twin support vector machines. Optimization Letters 3(1), 77–88 (2009)
66. Khemchandani, R., Karpatne, A., Chandra, S.: Twin support vector regression for the simultaneous learning of a function and its derivatives. International Journal of Machine Learning and Cybernetics 4(1), 51–63 (2013)
67. Khemchandani, R., Pal, A.: Multi-category Laplacian least squares twin support vector machine. Applied Intelligence 45(2), 458–474 (2016)
68. Khemchandani, R., Pal, A.: Weighted linear loss twin support vector clustering. In: Proceedings of the 3rd IKDD Conference on Data Science, 2016, p. 18. ACM (2016)
69. Khemchandani, R., Pal, A.: Tree based multi-category Laplacian TWSVM for content based image retrieval. International Journal of Machine Learning and Cybernetics 8(4), 1197–1210 (2017)
70. Khemchandani, R., Pal, A., Chandra, S.: Fuzzy least squares twin support vector clustering. Neural Computing and Applications 29(2), 553–563 (2018)
71. Khemchandani, R., Saigal, P., Chandra, S.: Improvements on ν-twin support vector machine. Neural Networks 79, 97–107 (2016)
72. Khemchandani, R., Saigal, P., Chandra, S.: Angle-based twin support vector machines. Annals of Operations Research 269, 387–417 (2018)
73. Khemchandani, R., Sharma, S.: Robust least squares twin support vector machine for human activity recognition. Applied Soft Computing 47, 33–46 (2016)
74. Khemchandani, R., Sharma, S.: Robust parametric twin support vector machine and its application in human activity recognition. In: Proceedings of International Conference on Computer Vision and Image Processing, pp. 193–203. Springer (2017)
75. Kostılek, M., St'astny, J.: EEG biometric identification: repeatability and influence of movement-related EEG. In: 2012 International Conference on Applied Electronics, pp. 147–150. IEEE (2012)
76. Kuhn, H.W., Tucker, A.W.: Nonlinear programming. In: Proceedings of the 2nd Berkeley Symposium, pp. 481–492. University of California Press, Berkeley (1951)
77. Kumar, M.A., Gopal, M.: Application of smoothing technique on twin support vector machines. Pattern Recognition Letters 29(13), 1842–1848 (2008)
78. Kumar, M.A., Gopal, M.: Least squares twin support vector machines for pattern classification. Expert Systems with Applications 36(4), 7535–7543 (2009)
79. Kumar, M.A., Khemchandani, R., Gopal, M., Chandra, S.: Knowledge based least squares twin support vector machines. Information Sciences 180(23), 4606–4618 (2010)
80. Kumar, M.P., Rajagopal, M.K.: Detecting happiness in human face using unsupervised twin-support vector machines. International Journal of Intelligent Systems and Applications 10(8), 85 (2018)


81. Kumar, M.P., Rajagopal, M.K.: Detecting facial emotions using normalized minimal feature vectors and semi-supervised twin support vector machines classifier. Applied Intelligence, pp. 1–25 (2019)
82. Lang, X., Li, P., Hu, Z., Ren, H., Li, Y.: Leak detection and location of pipelines based on LMD and least squares twin support vector machine. IEEE Access 5, 8659–8668 (2017)
83. Le, W., Wang, Z., Wang, J., Zhao, G., Miao, H.: A novel WiFi indoor positioning method based on genetic algorithm and twin support vector regression. In: The 26th Chinese Control and Decision Conference (2014 CCDC), pp. 4859–4862. IEEE (2014)
84. Li, C.N., Huang, Y.F., Wu, H.J., Shao, Y.H., Yang, Z.M.: Multiple recursive projection twin support vector machine for multi-class classification. International Journal of Machine Learning and Cybernetics 7(5), 729–740 (2016)
85. Li, C.N., Ren, P.W., Shao, Y.H., Ye, Y.F., Guo, Y.R.: Generalized elastic net lp-norm nonparallel support vector machine. Engineering Applications of Artificial Intelligence 88, 103397 (2020)
86. Li, C.N., Shao, Y.H., Wang, H., Zhao, Y.T., Huang, L.W., Xiu, N., Deng, N.Y.: Single versus union: Non-parallel support vector machine frameworks. arXiv preprint arXiv:1910.09734 (2019)
87. Li, D., Zhang, H., Khan, M.S., Mi, F.: A self-adaptive frequency selection common spatial pattern and least squares twin support vector machine for motor imagery electroencephalography recognition. Biomedical Signal Processing and Control 41, 222–232 (2018)
88. de Lima, M.D., Costa, N.L., Barbosa, R.: Improvements on least squares twin multi-class classification support vector machine. Neurocomputing 313, 196–205 (2018)
89. Liu, D., Shi, Y., Tian, Y.: Ramp loss nonparallel support vector machine for pattern classification. Knowledge-Based Systems 85, 224–233 (2015)
90. Mangasarian, O.L., Wild, E.W.: Proximal support vector machine classifiers. In: Proceedings KDD-2001: Knowledge Discovery and Data Mining. Citeseer (2001)
91. Mangasarian, O.L., Wild, E.W.: Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(1), 69–74 (2006)
92. Mei, B., Xu, Y.: Multi-task least squares twin support vector machine for classification. Neurocomputing (2019)
93. Morra, J.H., Tu, Z., Apostolova, L.G., Green, A.E., Toga, A.W., Thompson, P.M.: Comparison of AdaBoost and support vector machines for detecting Alzheimer disease through automated hippocampal segmentation. IEEE Transactions on Medical Imaging 29(1), 30–43 (2010)
94. Mousavi, A., Ghidary, S.S., Karimi, Z.: Semi-supervised intrusion detection via online Laplacian twin support vector machine. In: 2015 Signal Processing and Intelligent Systems Conference (SPIS), pp. 138–142. IEEE (2015)


95. Mozafari, K., Nasiri, J.A., Charkari, N.M., Jalili, S.: Action recognition by local space-time features and least square twin SVM (LS-TSVM). In: 2011 First International Conference on Informatics and Computational Intelligence, pp. 287–292. IEEE (2011)
96. Nasiri, J.A., Charkari, N.M., Jalili, S.: Least squares twin multi-class classification support vector machine. Pattern Recognition 48(3), 984–992 (2015)
97. Nasiri, J.A., Charkari, N.M., Mozafari, K.: Energy-based model of least squares twin support vector machines for human action recognition. Signal Processing 104, 248–257 (2014)
98. Nie, P., Zang, L., Liu, L.: Application of multi-class classification algorithm based on twin support vector machine in intrusion detection. Jisuanji Yingyong/Journal of Computer Applications 33(2), 426–429 (2013)
99. Noble, W.S.: Support vector machine applications in computational biology. Kernel Methods in Computational Biology 71, 92 (2004)
100. Osuna, E., Freund, R., Girosi, F.: Training support vector machines: an application to face detection. In: CVPR, p. 130. IEEE (1997)
101. Pan, X., Luo, Y., Xu, Y.: K-nearest neighbor based structural twin support vector machine. Knowledge-Based Systems 88, 34–44 (2015)
102. Pan, X., Yang, Z., Xu, Y., Wang, L.: Safe screening rules for accelerating twin support vector machine classification. IEEE Transactions on Neural Networks and Learning Systems 29(5), 1876–1887 (2017)
103. Pang, X., Xu, C., Xu, Y.: Scaling KNN multi-class twin support vector machine via safe instance reduction. Knowledge-Based Systems 148, 17–30 (2018)
104. Pang, X., Xu, Y.: A safe screening rule for accelerating weighted twin support vector machine. Soft Computing 23(17), 7725–7739 (2019)
105. Parastalooi, N., Amiri, A., Aliheidari, P.: Modified twin support vector regression. Neurocomputing 211, 84–97 (2016)
106. Peng, X.: Least squares twin support vector hypersphere (LS-TSVH) for pattern recognition. Expert Systems with Applications 37(12), 8371–8378 (2010)
107. Peng, X.: A ν-twin support vector machine (ν-TSVM) classifier and its geometric algorithms. Information Sciences 180(20), 3863–3875 (2010)
108. Peng, X.: Primal twin support vector regression and its sparse approximation. Neurocomputing 73(16-18), 2846–2858 (2010)
109. Peng, X.: TSVR: an efficient twin support vector machine for regression. Neural Networks 23(3), 365–372 (2010)
110. Peng, X.: Building sparse twin support vector machine classifiers in primal space. Information Sciences 181(18), 3967–3980 (2011)
111. Peng, X.: TPMSVM: a novel twin parametric-margin support vector machine for pattern recognition. Pattern Recognition 44(10-11), 2678–2692 (2011)
112. Peng, X.: Efficient twin parametric insensitive support vector regression model. Neurocomputing 79, 26–38 (2012)


113. Peng, X., Chen, D., Kong, L., Xu, D.: Interval twin support vector regression algorithm for interval input-output data. International Journal of Machine Learning and Cybernetics 6(5), 719–732 (2015)
114. Peng, X., Kong, L., Chen, D.: Improvements on twin parametric-margin support vector machine. Neurocomputing 151, 857–863 (2015)
115. Peng, X., Wang, Y., Xu, D.: Structural twin parametric-margin support vector machine for binary classification. Knowledge-Based Systems 49, 63–72 (2013)
116. Peng, X., Xu, D.: Twin Mahalanobis distance-based support vector machines for pattern recognition. Information Sciences 200, 22–37 (2012)
117. Peng, X., Xu, D.: Bi-density twin support vector machines for pattern recognition. Neurocomputing 99, 134–143 (2013)
118. Peng, X., Xu, D.: Robust minimum class variance twin support vector machine classifier. Neural Computing and Applications 22(5), 999–1011 (2013)
119. Peng, X., Xu, D.: A twin-hypersphere support vector machine classifier and the fast learning algorithm. Information Sciences 221, 12–27 (2013)
120. Peng, X., Xu, D., Shen, J.: A twin projection support vector machine for data regression. Neurocomputing 138, 131–141 (2014)
121. Qi, Z., Tian, Y., Shi, Y.: Laplacian twin support vector machine for semi-supervised classification. Neural Networks 35, 46–53 (2012)
122. Qi, Z., Tian, Y., Shi, Y.: Twin support vector machine with universum data. Neural Networks 36, 112–119 (2012)
123. Qi, Z., Tian, Y., Shi, Y.: Robust twin support vector machine for pattern classification. Pattern Recognition 46(1), 305–316 (2013)
124. Qi, Z., Tian, Y., Shi, Y.: Structural twin support vector machine for classification. Knowledge-Based Systems 43, 74–81 (2013)
125. Qi, Z., Tian, Y., Shi, Y.: A nonparallel support vector machine for a classification problem with universum learning. Journal of Computational and Applied Mathematics 263, 288–298 (2014)
126. Qiang, W., Zhang, J., Zhen, L., Jing, L.: Robust weighted linear loss twin multi-class support vector regression for large-scale classification. Signal Processing 170, 107449 (2020)
127. Rastogi, R., Anand, P., Chandra, S.: ν-norm twin support vector machine-based regression. Optimization 66(11), 1895–1911 (2017)
128. Rastogi, R., Anand, P., Chandra, S.: Large-margin distribution machine-based regression. Neural Computing and Applications 3, 1–16 (2018)
129. Rastogi, R., Anand, P., Chandra, S.: A ν-twin support vector machine based regression with automatic accuracy control. Applied Intelligence 46(3), 670–683 (2017)
130. Rastogi, R., Pal, A.: Fuzzy semi-supervised weighted linear loss twin support vector clustering. Knowledge-Based Systems 165, 132–148 (2019)
131. Rastogi, R., Pal, A., Chandra, S.: Generalized pinball loss SVMs. Neurocomputing 322, 151–165 (2018)

132. Rastogi, R., Saigal, P.: Tree-based localized fuzzy twin support vector clustering with square loss function. Applied Intelligence 47(1), 96–113 (2017)

133. Rastogi, R., Saigal, P., Chandra, S.: Angle-based twin parametric-margin support vector machine for pattern classification. Knowledge-Based Systems 139, 64–77 (2018)

134. Rastogi, R., Sharma, S.: Fast Laplacian twin support vector machine with active learning for pattern classification. Applied Soft Computing 74, 424–439 (2019)

135. Rastogi, R., Sharma, S., Chandra, S.: Robust parametric twin support vector machine for pattern classification. Neural Processing Letters 47(1), 293–323 (2018)

136. Refahi, M.S., Nasiri, J.A., Ahadi, S.: ECG arrhythmia classification using least squares twin support vector machines. In: 2018 Iranian Conference on Electrical Engineering (ICEE), pp. 1619–1623. IEEE (2018)

137. Rezvani, S., Wang, X., Pourpanah, F.: Intuitionistic fuzzy twin support vector machines. IEEE Transactions on Fuzzy Systems (2019)

138. Richhariya, B., Sharma, A., Tanveer, M.: Improved universum twin support vector machine. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 2045–2052. IEEE (2018)

139. Richhariya, B., Tanveer, M.: EEG signal classification using universum support vector machine. Expert Systems with Applications 106, 169–182 (2018)

140. Richhariya, B., Tanveer, M.: A robust fuzzy least squares twin support vector machine for class imbalance learning. Applied Soft Computing 71, 418–432 (2018)

141. Richhariya, B., Tanveer, M.: A fuzzy universum support vector machine based on information entropy. In: Machine Intelligence and Signal Analysis, pp. 569–582. Springer (2019)

142. Richhariya, B., Tanveer, M.: An efficient angle based universum least squares twin support vector machine for classification. ACM Transactions on Internet Technology (TOIT), https://doi.org/10.1145/3387131 (2020)

143. Richhariya, B., Tanveer, M.: A reduced universum twin support vector machine for class imbalance learning. Pattern Recognition 102, 107150 (2020)

144. Richhariya, B., Tanveer, M.: Universum least squares twin parametric-margin support vector machine. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2020)

145. Richhariya, B., Tanveer, M., Initiative, A.D.N.: Least squares projection twin support vector clustering (LSPTSVC). Information Sciences, https://doi.org/10.1016/j.ins.2020.05.001 (2020)

146. Richhariya, B., Tanveer, M., Rashid, A., Initiative, A.D.N.: Diagnosis of Alzheimer's disease using universum support vector machine based recursive feature elimination (USVM-RFE). Biomedical Signal Processing and Control 59, 101903 (2020)

147. Sartakhti, J.S., Afrabandpey, H., Saraee, M.: Simulated annealing least squares twin support vector machine (SA-LSTSVM) for pattern classification. Soft Computing 21(15), 4361–4373 (2017)

148. Scholkopf, B., Tsuda, K., Vert, J.P.: Support Vector Machine Applications in Computational Biology. MIT Press (2004)

149. Shahdoosti, H.R., Hazavei, S.M.: Combined ripplet and total variation image denoising methods using twin support vector machines. Multimedia Tools and Applications 77(6), 7013–7031 (2018)

150. Shao, Y.H., Chen, W.J., Deng, N.Y.: Nonparallel hyperplane support vector machine for binary classification problems. Information Sciences 263, 22–35 (2014)

151. Shao, Y.H., Chen, W.J., Huang, W.B., Yang, Z.M., Deng, N.Y.: The best separating decision tree twin support vector machine for multi-class classification. Procedia Computer Science 17, 1032–1038 (2013)

152. Shao, Y.H., Chen, W.J., Wang, Z., Li, C.N., Deng, N.Y.: Weighted linear loss twin support vector machine for large-scale classification. Knowledge-Based Systems 73, 276–288 (2015)

153. Shao, Y.H., Chen, W.J., Zhang, J.J., Wang, Z., Deng, N.Y.: An efficient weighted Lagrangian twin support vector machine for imbalanced data classification. Pattern Recognition 47(9), 3158–3167 (2014)

154. Shao, Y.H., Deng, N.Y.: A coordinate descent margin based-twin support vector machine for classification. Neural Networks 25, 114–121 (2012)

155. Shao, Y.H., Deng, N.Y.: A novel margin-based twin support vector machine with unity norm hyperplanes. Neural Computing and Applications 22(7-8), 1627–1635 (2013)

156. Shao, Y.H., Deng, N.Y., Yang, Z.M.: Least squares recursive projection twin support vector machine for classification. Pattern Recognition 45(6), 2299–2307 (2012)

157. Shao, Y.H., Deng, N.Y., Yang, Z.M., Chen, W.J., Wang, Z.: Probabilistic outputs for twin support vector machines. Knowledge-Based Systems 33, 145–151 (2012)

158. Shao, Y.H., Wang, Z., Chen, W.J., Deng, N.Y.: Least squares twin parametric-margin support vector machine for classification. Applied Intelligence 39(3), 451–464 (2013)

159. Shao, Y.H., Wang, Z., Chen, W.J., Deng, N.Y.: A regularization for the projection twin support vector machine. Knowledge-Based Systems 37, 203–210 (2013)

160. Shao, Y.H., Yang, Z.X., Wang, X.B., Deng, N.Y.: Multiple instance twin support vector machines. Lecture Notes in Operations Research 12, 433–442 (2010)

161. Shao, Y.H., Zhang, C.H., Wang, X.B., Deng, N.Y.: Improvements on twin support vector machines. IEEE Transactions on Neural Networks 22(6), 962–968 (2011)

162. Shao, Y.H., Zhang, C.H., Yang, Z.M., Jing, L., Deng, N.Y.: An ε-twin support vector machine for regression. Neural Computing and Applications 23(1), 175–185 (2013)

163. Sharma, S., Rastogi, R.: Insensitive zone based pinball loss twin support vector machine for pattern classification. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 2238–2245. IEEE (2018)


164. Sharma, S., Rastogi, R.: Stochastic conjugate gradient descent twin support vector machine for large scale pattern classification. In: Australasian Joint Conference on Artificial Intelligence, pp. 590–602. Springer (2018)

165. Sharma, S., Rastogi, R., Chandra, S.: Large-scale twin parametric support vector machine using pinball loss function. IEEE Transactions on Systems, Man, and Cybernetics: Systems (2019)

166. She, Q., Ma, Y., Meng, M., Luo, Z.: Multiclass posterior probability twin SVM for motor imagery EEG classification. Computational Intelligence and Neuroscience p. 95 (2015)

167. Si, X., Jing, L.: Mass detection in digital mammograms using twin support vector machine-based CAD system. In: 2009 WASE International Conference on Information Engineering, vol. 1, pp. 240–243. IEEE (2009)

168. Singh, M., Chadha, J., Ahuja, P., Jayadeva, Chandra, S.: Reduced twin support vector regression. Neurocomputing 74(9), 1474–1477 (2011)

169. Singla, M., Ghosh, D., Shukla, K., Pedrycz, W.: Robust twin support vector regression based on rescaled hinge loss. Pattern Recognition p. 107395 (2020)

170. Smola, A.J., Scholkopf, B.: A tutorial on support vector regression. Statistics and Computing 14(3), 199–222 (2004)

171. Soman, S., et al.: High performance EEG signal classification using classifiability and the twin SVM. Applied Soft Computing 30, 305–318 (2015)

172. Suykens, J., Lukas, L., Van Dooren, P., De Moor, B., Vandewalle, J., et al.: Least squares support vector machine classifiers: a large scale algorithm. In: European Conference on Circuit Theory and Design, ECCTD, vol. 99, pp. 839–842. Citeseer (1999)

173. Tanveer, M.: Smoothing technique on linear programming twin support vector machines. International Journal of Machine Learning and Computing 3(2), 240 (2013)

174. Tanveer, M.: Application of smoothing techniques for linear programming twin support vector machines. Knowledge and Information Systems 45(1), 191–214 (2015)

175. Tanveer, M.: Newton method for implicit Lagrangian twin support vector machines. International Journal of Machine Learning and Cybernetics 6(6), 1029–1040 (2015)

176. Tanveer, M.: Robust and sparse linear programming twin support vector machines. Cognitive Computation 7(1), 137–149 (2015)

177. Tanveer, M.: Linear programming twin support vector regression. Filomat 31(7), 2123–2142 (2017)

178. Tanveer, M., Gautam, C., Suganthan, P.: Comprehensive evaluation of twin SVM based classifiers on UCI datasets. Applied Soft Computing 83, 105617 (2019)

179. Tanveer, M., Khan, M.A., Ho, S.S.: Robust energy-based least squares twin support vector machines. Applied Intelligence 45(1), 174–186 (2016)

180. Tanveer, M., Mangal, M., Ahmad, I., Shao, Y.H.: One norm linear programming support vector regression. Neurocomputing 173, 1508–1518 (2016)


181. Tanveer, M., Pachori, R.B., Angami, N.: Classification of seizure and seizure-free EEG signals using Hjorth parameters. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 2180–2185. IEEE (2018)

182. Tanveer, M., Pachori, R.B., Angami, N.: Entropy based features in FAWT framework for automated detection of epileptic seizure EEG signals. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1946–1952. IEEE (2018)

183. Tanveer, M., Rajani, T., Ganaie, M.: Improved sparse pinball twin SVM. In: 2019 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE (2019)

184. Tanveer, M., Richhariya, B., Khan, R.U., Rashid, A.H., Khanna, P., Prasad, M., Lin, C.T.: Machine learning techniques for the diagnosis of Alzheimer's disease: A review. ACM Transactions on Multimedia Computing, Communications, and Applications 16(1s), 1–35 (2020)

185. Tanveer, M., Sharma, A., Suganthan, P.: General twin support vector machine with pinball loss function. Information Sciences 494, 311–327 (2019)

186. Tanveer, M., Sharma, A., Suganthan, P.N.: Least squares K-nearest neighbor-based weighted multi-class twin SVM. Neurocomputing (in press) (2020)

187. Tanveer, M., Sharma, S., Khan, M.: Large scale least squares twin SVMs. ACM Transactions on Internet Technology, https://doi.org/10.1145/3398379 (2020)

188. Tanveer, M., Sharma, S., Rastogi, R., Anand, P.: Sparse support vector machine with pinball loss. Wiley Transactions on Emerging Telecommunications Technologies (ETT), https://doi.org/10.1002/ett.3820 (2020)

189. Tanveer, M., Shubham, K.: A regularization on Lagrangian twin support vector regression. International Journal of Machine Learning and Cybernetics 8(3), 807–821 (2017)

190. Tanveer, M., Shubham, K.: Smooth twin support vector machines via unconstrained convex minimization. Filomat 31(8), 2195–2210 (2017)

191. Tanveer, M., Shubham, K., Aldhaifallah, M., Nisar, K.: An efficient implicit regularized Lagrangian twin support vector regression. Applied Intelligence 44(4), 831–848 (2016)

192. Tanveer, M., Tiwari, A., Choudhary, R., Jalan, S.: Sparse pinball twin support vector machines. Applied Soft Computing 78, 164–175 (2019)

193. Tay, F.E., Cao, L.: Application of support vector machines in financial time series forecasting. Omega 29(4), 309–317 (2001)

194. Tian, Y., Ju, X., Qi, Z.: Efficient sparse nonparallel support vector machines for classification. Neural Computing and Applications 24(5), 1089–1099 (2014)

195. Tian, Y., Qi, Z., Ju, X., Shi, Y., Liu, X.: Nonparallel support vector machines for pattern classification. IEEE Transactions on Cybernetics 44(7), 1067–1079 (2014)


196. Tian, Y., Zhang, Q., Liu, D.: ν-nonparallel support vector machine for pattern classification. Neural Computing and Applications 25(5), 1007–1020 (2014)

197. Tomar, D., Agarwal, S.: Feature selection based least square twin support vector machine for diagnosis of heart disease. International Journal of Bio-Science and Bio-Technology 6(2), 69–82 (2014)

198. Tomar, D., Agarwal, S.: A comparison on multi-class classification methods based on least squares twin support vector machine. Knowledge-Based Systems 81, 131–147 (2015)

199. Tomar, D., Agarwal, S.: An effective weighted multi-class least squares twin support vector machine for imbalanced data classification. International Journal of Computational Intelligence Systems 8(4), 761–778 (2015)

200. Tomar, D., Agarwal, S.: Hybrid feature selection based weighted least squares twin support vector machine approach for diagnosing breast cancer, hepatitis, and diabetes. Advances in Artificial Neural Systems p. 1 (2015)

201. Tomar, D., Ojha, D., Agarwal, S.: An emotion detection system based on multi least squares twin support vector machine. Advances in Artificial Intelligence p. 8 (2014)

202. Tomar, D., Prasad, B.R., Agarwal, S.: An efficient Parkinson disease diagnosis system based on least squares twin support vector machine and particle swarm optimization. In: 2014 9th International Conference on Industrial and Information Systems (ICIIS), pp. 1–6. IEEE (2014)

203. Tomar, D., Singhal, S., Agarwal, S.: Weighted least square twin support vector machine for imbalanced dataset. International Journal of Database Theory and Application 7(2), 25–36 (2014)

204. Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. Journal of Machine Learning Research 2(Nov), 45–66 (2001)

205. Vapnik, V.: The Nature of Statistical Learning Theory. Springer Science & Business Media (2013)

206. Wang, D., Ye, Q., Ye, N.: Localized multi-plane TWSVM classifier via manifold regularization. In: 2010 Second International Conference on Intelligent Human-Machine Systems and Cybernetics, vol. 2, pp. 70–73. IEEE (2010)

207. Wang, H., Xu, Y.: Scaling up twin support vector regression with safe screening rule. Information Sciences 465, 174–190 (2018)

208. Wang, S., Chen, M., Li, Y., Shao, Y., Zhang, Y., Du, S., Wu, J.: Morphological analysis of dendrites and spines by hybridization of ridge detection with twin support vector machine. PeerJ 4, e2207 (2016)

209. Wang, S., Lu, S., Dong, Z., Yang, J., Yang, M., Zhang, Y.: Dual-tree complex wavelet transform and twin support vector machine for pathological brain detection. Applied Sciences 6(6), 169 (2016)

210. Wang, S., Zhang, Y., Liu, G., Phillips, P., Yuan, T.F.: Detection of Alzheimer disease by three-dimensional displacement field estimation in structural magnetic resonance imaging. Journal of Alzheimer Disease 50(1), 233–248 (2016)

211. Wang, S., Zhang, Y., Yang, X., Sun, P., Dong, Z., Liu, A., Yuan, T.F.: Pathological brain detection by a novel image feature fractional Fourier entropy. Entropy 17(12), 8278–8296 (2015)

212. Wang, Z., Shao, Y.H., Bai, L., Deng, N.Y.: Twin support vector machine for clustering. IEEE Transactions on Neural Networks and Learning Systems 26(10), 2583–2588 (2015)

213. Wang, Z., Shao, Y.H., Bai, L., Li, C.N., Liu, L.M.: A general model for plane-based clustering with loss function. arXiv preprint arXiv:1901.09178 (2019)

214. Wang, Z., Shao, Y.H., Wu, T.R.: A GA-based model selection for smooth twin parametric-margin support vector machine. Pattern Recognition 46(8), 2267–2277 (2013)

215. Xie, J., Hone, K., Xie, W., Gao, X., Shi, Y., Liu, X.: Extending twin support vector machine classifier for multi-category classification problems. Intelligent Data Analysis 17(4), 649–664 (2013)

216. Xie, X., Sun, S.: Multitask centroid twin support vector machines. Neurocomputing 149, 1085–1091 (2015)

217. Xu, Y.: K-nearest neighbor-based weighted multi-class twin support vector machine. Neurocomputing 205, 430–438 (2016)

218. Xu, Y.: Maximum margin of twin spheres support vector machine for imbalanced data classification. IEEE Transactions on Cybernetics 47(6), 1540–1550 (2017)

219. Xu, Y., Chen, M., Li, G.: Least squares twin support vector machine with universum data for classification. International Journal of Systems Science 47(15), 3637–3645 (2016)

220. Xu, Y., Chen, M., Yang, Z., Li, G.: ν-twin support vector machine with universum data for classification. Applied Intelligence 44(4), 956–968 (2016)

221. Xu, Y., Guo, R.: A twin hyper-sphere multi-class classification support vector machine. Journal of Intelligent & Fuzzy Systems 27(4), 1783–1790 (2014)

222. Xu, Y., Guo, R., Wang, L.: A twin multi-class classification support vector machine. Cognitive Computation 5(4), 580–588 (2013)

223. Xu, Y., Li, X., Pan, X., Yang, Z.: Asymmetric ν-twin support vector regression. Neural Computing and Applications 30(12), 3799–3814 (2018)

224. Xu, Y., Lv, X., Wang, Z., Wang, L.: A weighted least squares twin support vector machine. Journal of Information Science and Engineering 30(6), 1773–1787 (2014)

225. Xu, Y., Pan, X., Zhou, Z., Yang, Z., Zhang, Y.: Structural least square twin support vector machine for classification. Applied Intelligence 42(3), 527–536 (2015)

226. Xu, Y., Wang, L.: A weighted twin support vector regression. Knowledge-Based Systems 33, 92–101 (2012)

227. Xu, Y., Wang, L.: K-nearest neighbor-based weighted twin support vector regression. Applied Intelligence 41(1), 299–309 (2014)


228. Xu, Y., Wang, L., Zhong, P.: A rough margin-based ν-twin support vector machine. Neural Computing and Applications 21(6), 1307–1317 (2012)

229. Xu, Y., Wang, Q., Pang, X., Tian, Y.: Maximum margin of twin spheres machine with pinball loss for imbalanced data classification. Applied Intelligence 48(1), 23–34 (2018)

230. Xu, Y., Xi, W., Lv, X., Guo, R.: An improved least squares twin support vector machine. Journal of Information and Computational Science 9(4), 1063–1071 (2012)

231. Xu, Y., Yang, Z., Pan, X.: A novel twin support-vector machine with pinball loss. IEEE Transactions on Neural Networks and Learning Systems 28(2), 359–370 (2016)

232. Xu, Y., Yang, Z., Zhang, Y., Pan, X., Wang, L.: A maximum margin and minimum volume hyper-spheres machine with pinball loss for imbalanced data classification. Knowledge-Based Systems 95, 75–85 (2016)

233. Xu, Y., Yu, J., Zhang, Y.: KNN-based weighted rough ν-twin support vector machine. Knowledge-Based Systems 71, 303–313 (2014)

234. Xu, Y., Zhang, Y., Yang, Z., Pan, X., Li, G.: Imbalanced and semi-supervised classification for prognosis of ACLF. Journal of Intelligent & Fuzzy Systems 28(2), 737–745 (2015)

235. Yang, C., Wu, Z.: Study to multi-twin support vector machines and its applications in speaker recognition. In: 2009 International Conference on Computational Intelligence and Software Engineering, pp. 1–4. IEEE (2009)

236. Yang, H., Huang, K., King, I., Lyu, M.R.: Localized support vector regression for time series prediction. Neurocomputing 72(10-12), 2659–2669 (2009)

237. Yang, H.Y., Wang, X.Y., Niu, P.P., Liu, Y.C.: Image denoising using nonsubsampled shearlet transform and twin support vector machines. Neural Networks 57, 152–165 (2014)

238. Yang, Z., Pan, X., Xu, Y.: Piecewise linear solution path for pinball twin support vector machine. Knowledge-Based Systems 160, 311–324 (2018)

239. Yang, Z., Xu, Y.: Laplacian twin parametric-margin support vector machine for semi-supervised classification. Neurocomputing 171, 325–334 (2016)

240. Yang, Z., Xu, Y.: A safe sample screening rule for Laplacian twin parametric-margin support vector machine. Pattern Recognition 84, 1–12 (2018)

241. Yang, Z.M., Wu, H.J., Li, C.N., Shao, Y.H.: Least squares recursive projection twin support vector machine for multi-class classification. International Journal of Machine Learning and Cybernetics 7(3), 411–426 (2016)

242. Yang, Z.X., Shao, Y.H., Zhang, X.S.: Multiple birth support vector machine for multi-class classification. Neural Computing and Applications 22(1), 153–161 (2013)

243. Ye, Q., Ye, N., Gao, S.: Density-based weighting multi-surface least squares classification with its applications. Knowledge and Information Systems 33(2), 289–308 (2012)

244. Ye, Q., Zhao, C., Gao, S., Zheng, H.: Weighted twin support vector machines with local information and its application. Neural Networks 35, 31–39 (2012)

245. Ye, Q., Zhao, C., Ye, N., Chen, X.: Localized twin SVM via convex minimization. Neurocomputing 74(4), 580–587 (2011)

246. Ye, Q., Zhao, H., Li, Z., Yang, X., Gao, S., Yin, T., Ye, N.: L1-norm distance minimization-based fast robust twin support vector k-plane clustering. IEEE Transactions on Neural Networks and Learning Systems 29(9), 4494–4503 (2018)

247. Ye, Y., Shao, Y., Chen, W.: Comparing inflation forecasts using an ε-wavelet twin support vector regression. Journal of Information & Computational Science 10(7), 2041–2049 (2013)

248. Ye, Y.F., Bai, L., Hua, X.Y., Shao, Y.H., Wang, Z., Deng, N.Y.: Weighted Lagrange ε-twin support vector regression. Neurocomputing 197, 53–68 (2016)

249. Ye, Y.F., Cao, H., Bai, L., Wang, Z., Shao, Y.H.: Exploring determinants of inflation in China based on L1-ε-twin support vector regression. Procedia Computer Science 17, 514–522 (2013)

250. Zhang, H., Li, H.: Fuzzy twin support vector machine based on intra-class hyperplane. In: Journal of Physics: Conference Series, vol. 1302, p. 032016. IOP Publishing (2019)

251. Zhang, X.: Boosting twin support vector machine approach for MCs detection. In: 2009 Asia-Pacific Conference on Information Processing, vol. 1, pp. 149–152. IEEE (2009)

252. Zhang, Y., Wang, S.: Detection of Alzheimer disease by displacement field and machine learning. PeerJ 3, e1251 (2015)

253. Zhang, Z., Zhen, L., Deng, N., Tan, J.: Sparse least square twin support vector machine with adaptive norm. Applied Intelligence 41(4), 1097–1107 (2014)

254. Zhao, J., Xu, Y., Fujita, H.: An improved non-parallel universum support vector machine and its safe sample screening rule. Knowledge-Based Systems 170, 79–88 (2019)

255. Zhao, Y.P., Zhao, J., Zhao, M.: Twin least squares support vector regression. Neurocomputing 118, 225–236 (2013)

256. Wang, Z., Chen, X., Shao, Y.H., Li, C.N.: Ramp-based twin support vector clustering. Neural Computing and Applications, https://doi.org/10.1007/s00521-019-04511-3 (2019)

257. Zhong, P., Xu, Y., Zhao, Y.: Training twin support vector regression via linear programming. Neural Computing and Applications 21(2), 399–407 (2012)