anand kr gupta
ANAND KUMAR [email protected], [email protected]
www.linkedin.com/in/anand1anand
9911555838, 9015750105
Looking for a career in Data Science, Machine Learning and Data Analytics
Core Competencies and Strengths
• Knowledge and experience of web crawling, web scraping and data wrangling/munging.
• Knowledge and experience of word clouds and sentiment analysis.
• Hands-on experience with machine learning algorithms (supervised and unsupervised).
• Knowledge and experience of regression analysis (OLS, GLM), ANOVA and correlation.
• Knowledge of tests of significance (t, z and chi-square).
• Proficiency in computer and IT systems.
• Excellent in descriptive and inferential statistics.
Key Skills
Scripting/Programming : R, Python, C and Data Structures using C.
Tools and Packages : RStudio, Canopy, Scikit, NetworkX, SPSS, RapidMiner, Weka, Pajek, Gephi, NodeXL.
Database : MySQL, SQLite, Advanced Excel, MS Access.
Operating System : Windows, UNIX/Linux (Ubuntu).
Web : HTML, XML, JavaScript.
Work Experience
1. Organisation : Jaypee Institute of Information Technology, Noida.
Designation : Research Assistant.
Job Profile : Data crawling, web scraping, data cleaning and pre-processing, statistical modelling and analysis, algorithm design and simulation, conducting lab sessions for programming and data analysis.
Period : Jul 2013 to date.
2. Organisation : Nex-G Exuberant Solutions Pvt. Ltd, Noida.
Designation : Corporate Trainer.
Job Profile : Training and assistance in programming for Protocol Stack Development.
Period : Feb 2013 to May 2013.
3. Organisation : Linkers India.
Designation : Trainer.
Job Profile : Training and assistance in programming, databases and operating systems.
Period : Apr 2011 to Dec 2012.
4. Organisation : Planman Technology.
Designation : Project Executive.
Job Profile : HTML/XML.
Period : June 2010 to Mar 2011.
Projects
1. Link Prediction in Ego Networks using Naïve Bayes
Platform and Skills: R, Excel, RStudio, Python, Canopy, Notepad++.
Packages: e1071, ROCR, XLConnect, NetworkX.
Algorithms: Naïve Bayes.
Performance Testing: ROC curve, precision, recall, k-fold cross-validation.
Brief: The data set contains the connectivity information of users in online social networks. The goal is to predict possible future connections among users on the basis of their current connectivity information. We treated link prediction as a binary classification problem and applied the Naïve Bayes machine learning technique to predict links in ego networks.
Major tasks involved in the project were:
• Defining the problem statement.
• Collection of data.
• Pre-processing of data.
• Extraction of common users and common features between already connected and disconnected users.
• Applying the Naïve Bayes machine learning technique to predict missing links in ego networks.
• Validation of results through k-fold cross-validation and the ROC curve.
• Plotting the output as an ROC curve.
• Final output in a CSV file and report in a PDF file.
2. Analysis of Link Prediction and Impact of Structural Properties of Networks
Platform and Skills: R, Excel, RStudio, Python, Anaconda, NodeXL, Notepad++.
Packages: XLConnect, NetworkX.
Algorithms: Correlation, Jaccard Index, Cosine Similarity, Adamic-Adar Index, RA, HPA, HDA.
Performance Analysis: Correlation, precision, recall, PPV (Positive Predictive Value).
Brief: A variety of network data sets was collected from different sources. The goal is to analyze the impact of structural properties on node-neighborhood-based (similarity-score-based) link prediction techniques.
Major tasks involved in the project were:
• Defining the problem statement.
• Collection of data.
• Pre-processing of data.
• Extraction of common users between connected and disconnected users.
• Developing a model to analyze the performance of link prediction techniques.
• Developing a model to check the different structural properties of the networks.
• Validating the model by applying correlation.
• Plotting the output as a line graph.
• Final output in a CSV file and report in a PDF file.
3. Prediction of Future Links in Online Social Networks
Platform and Skills: Python, Anaconda, Excel, Notepad++.
Packages: NetworkX, scikit-learn, NumPy, SciPy, matplotlib.
Algorithms: Jaccard Coefficient, Cosine Similarity, Adamic-Adar Index, Resource Allocation Index.
Performance Testing: TPR, FPR, precision, recall, F-measure.
Brief: The data set contains the connectivity information of users in online social networks. The goal is to predict possible future connections among users on the basis of their current connectivity information.
Major tasks involved in the project were:
• Defining the problem statement.
• Collection of data.
• Pre-processing of data.
• Extraction of common users between already connected and disconnected users.
• Developing a model to calculate the similarity score between each pair of individuals.
• Developing a model to calculate the threshold used to predict links.
• Plotting the output.
• Final output in a CSV file.
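The similarity-score and threshold steps listed above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the toy graph, the function names and the fixed threshold of 0.5 are all assumptions made for the example.

```python
import math
from itertools import combinations

# Hypothetical toy network stored as adjacency sets (undirected).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

def jaccard(g, u, v):
    """Shared neighbours divided by all neighbours of u or v."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0

def adamic_adar(g, u, v):
    """Sum of 1/log(degree) over common neighbours.
    A common neighbour is adjacent to both u and v, so its degree is >= 2."""
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])

def resource_allocation(g, u, v):
    """Sum of 1/degree over common neighbours."""
    return sum(1.0 / len(g[z]) for z in g[u] & g[v])

def predict_links(g, score, threshold):
    """Return currently unconnected pairs whose score reaches the threshold."""
    return [
        (u, v)
        for u, v in combinations(sorted(g), 2)
        if v not in g[u] and score(g, u, v) >= threshold
    ]

predicted = predict_links(graph, jaccard, 0.5)
```

Here the threshold is passed in by hand; how the project derived its threshold is not stated, so that part is left out of the sketch.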
4. Twitter Sentiment Analysis
Platform and Skills: R, Excel, RStudio, Notepad++.
Packages: twitteR, stringr, ggplot2, plyr.
Performance Testing: Precision, recall, F-measure.
Brief: Streaming live data from Twitter for a particular #tag and @mention using the twitteR package in R, then pre-processing and analysing the data on a scale of Negative, Neutral and Positive.
Major tasks involved in the project were:
• Defining the problem statement.
• Streaming live Twitter data with #tag and @mention.
• Data cleansing.
• Applying text mining to identify relevant problems and associate them with actionable insights.
• Creating a dictionary of negative and positive words.
• Developing a model to calculate the polarity of each row in the data frame.
• Writing a function to score sentiment on a scale of Negative, Neutral and Positive.
• Plotting the output.
• Final output in a text file containing graphs and reports.
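The dictionary-based polarity step above can be illustrated roughly as follows. The project itself used R; this Python sketch only shows the idea, and the tiny word lists, the cleaning rules and the function names are assumptions for the example.

```python
import re

# Hypothetical mini-lexicons; a real dictionary would be far larger.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "poor", "hate", "sad"}

def clean(text):
    """Lower-case the text, drop URLs, #tags and @mentions, return the words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[@#]\w+", " ", text)
    return re.findall(r"[a-z']+", text)

def polarity(text):
    """Count of positive words minus count of negative words."""
    words = clean(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def label(text):
    """Map the polarity score onto Negative / Neutral / Positive."""
    score = polarity(text)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"
```

Counting word matches is the simplest possible scoring rule; it ignores negation and sarcasm, which a fuller model would need to handle.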
5. Text mining and Sentiment Analysis
Platform and Skills: R, Excel, RStudio, Notepad++.
Packages: RCurl, wordcloud, tm, NLP, plyr, stringr, ggplot2, SnowballC.
Performance Analysis: Word cloud, term frequency, association.
Brief: The data file contains the first speech of the Chief Minister of Delhi, Mr. Arvind Kejriwal, in text format. The goal is to analyze the speech by applying text mining to the collected data.
Major tasks involved in the project were:
• Defining the problem statement.
• Collection of data.
• Data cleansing.
• Text transformation and mapping.
• Text stemming.
• Developing a model to calculate the term-document matrix.
• Writing a function to plot the word cloud and the association between words.
• Plotting the output as a word cloud.
• Final output in a text file containing word cloud graphs and reports.
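The term-document matrix step above can be sketched as shown below. The project used the R tm package; this Python version is only an illustration, it skips stemming, and the stop-word list and the two sample sentences are made up for the example rather than taken from the speech.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is", "in", "we", "will"}

def tokenize(text):
    """Lower-case, keep alphabetic words, drop stop words (no stemming)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def term_document_matrix(docs):
    """Return (terms, matrix) where matrix[i][j] counts term i in document j."""
    counts = [Counter(tokenize(d)) for d in docs]
    terms = sorted(set().union(*counts))
    return terms, [[c[t] for c in counts] for t in terms]

# Made-up sample documents.
docs = ["We will fight corruption.", "Corruption is the enemy of the people."]
terms, matrix = term_document_matrix(docs)
```

Row sums of the matrix give the overall term frequencies that a word cloud would be drawn from.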
6. Web Scraper and Crawler
Platform and Skills: Python, Canopy, R, RStudio, Excel, Notepad++.
Packages: BeautifulSoup, requests, csv, twitter, RSQLite, rvest, XML, plyr, stringr.
Number of Websites: 9, including Twitter.
Brief: The task is to collect data from different websites for analysis. The data on the selected websites changes periodically, so it has to be collected from those sites at different time intervals.
Major tasks involved in the project were:
• Analysis of the websites and the structure of their data.
• Writing Python and R scripts to crawl and scrape the data from different websites.
• Pre-processing of data.
• Storing the final processed data in CSV, text, Excel and SQL files.
Research Publications
International Journal (Communicated):
• Link Prediction in online social network using machine learning binary classifier techniques (Scopus).
• Naïve Bayes Approach for Predicting Missing Links in Ego Networks (Scopus).
• Prediction of Missing Links in Social Networks: FINN (Feature Integration with Node Neighbor) (Scopus).
• Performance of Link Prediction Techniques Using Dynamic Threshold over Ego Networks (Scopus).
International Conference (Published):
• Gupta, Anand Kumar; Sardana, Neetu, "Significance of Clustering Coefficient over Jaccard Index," in Contemporary Computing (IC3), 2015 Eighth International Conference on, pp. 463-466, 20-22 Aug. 2015. (IEEE, Scopus, DBLP indexed).
• Gupta, A.K.; Sardana, N., "Impact of topological properties over link prediction based on node neighbourhood: A study," in Contemporary Computing (IC3), 2014 Seventh International Conference on, pp. 194-198, 7-9 Aug. 2014. (IEEE, Scopus, DBLP indexed).
Conferences/Seminars/Workshop Attended
• Two-day workshop on “Data Analytics and its Security Issues” (04-05 December 2015), JIIT Noida.
• One-day workshop on “Plagiarism and Reference Management using Mendeley” (16 October 2015), JIIT Noida.
• One-week workshop on “Research Competency and Development Programme” (14-19 September 2015), JBS Noida.
• Eighth International Conference on Contemporary Computing (20-22 August 2015), JIIT Noida.
• One-day workshop on “Advanced Optimization Techniques” (23 September 2014), JIIT Noida.
• Seventh International Conference on Contemporary Computing (07-09 August 2014), JIIT Noida.
• Six-day FDP on “Embedded System” (28 July-02 August 2014), JIIT Noida.
• One-day FDP on “Data Mining and Social Media Analytics - Emerging Trends and Challenges” (21 December 2013), BVICAM New Delhi.
• Sixth International Conference on Contemporary Computing (08-10 August 2013), JIIT Noida.
Educational Qualifications
Ph.D. (Pursuing) : JIIT (Department of CSE/IT), Noida, 2013 to date.
Research Area : Online Social Network Analysis (Link Prediction).
MCA : Institute of Management Education, UPTU Lucknow, 2007-10.
B.Sc. : D.D.U. Gorakhpur University, in Mathematics and Statistics, 2004-07.
XII : U.P. Board in Math, Physics, Chemistry, 2002-04.
X : U.P. Board in Science, 2000-02.
Achievement & Extra Curricular Activities
• UGC NET (Computer Science and Application) - Dec 2012.
• Volunteer work for homeless people with the NGO ‘Rain Basera’, governed by DUSIB.
• Awarded Best Class Representative in MCA.
• Awarded Excellent Performer for managing various events in the college fest.
• Best Student Award in the NSS (National Service Scheme) camp during graduation.
Personal Vitae
Passport : M0257293 (Date of Expiry 21-07-2024)
Address : S-534, 2nd School Block, Shakarpur, Laxminagar, New Delhi-92
Date of Birth : 6th March 1988
Gender : Male
Marital Status : Single
Languages Known : English, Hindi
Hobbies : Cycling, Photography, Travelling, Playing Guitar, Singing
Passion : Teaching
Declaration
I hereby declare that the information furnished above is true and complete to the best of my knowledge and belief.
Date:
Place:
Anand Kumar Gupta