TRANSCRIPT
Data Science Initiatives at (ICMC) USP
André C. P. L. F. de Carvalho
Universidade de São Paulo
André P L F de Carvalho 2
Topics
Introduction
São Carlos
Data Science ICMC-USP
Responsible Data Science
Previous partnerships with UGPN
Conclusion
São Carlos
Andre Ponce de Leon de Carvalho 3
São Carlos
235 km from São Paulo
220,000 inhabitants
Brazilian Capital of Technology
Highest PhD per capita in Latin America
Two public universities and two Embrapa research centres
Startup hub
© André de Carvalho - ICMC/USP 4 Andre Ponce de Leon de Carvalho 4
São Carlos
Average temperature:
Summer: 24 °C
Winter: 16 °C
ICMC
Institute of Mathematics and Computer Sciences
USP São Carlos campus, São Paulo, Brazil
Four departments:
Computer Sciences
Mathematics
Applied Mathematics
Statistics
Data Science Network
EMAp-FGV
ICMC-USP
Center for Data Science, NYU
University of London
INRIA, France
Big Data Research Center, Chinese University of Hong Kong
Data Science and Engineering Consortium
George Mason University, USA
Oregon State University, USA
Universidad Carlos III, Madrid, Spain
Universidad de Santiago de Compostela, Spain
Universidad Nacional Autónoma de México, Mexico
INFOTEC-CONACYT, Mexico
Universidad Católica San Pablo, Arequipa, Peru
Universidade de São Paulo, São Carlos, Brazil
Universidade do Porto, Portugal
Data Science at ICMC-USP
Undergraduate level:
Minor in Data Science for 6 majors
Applied Mathematics
Computer Science
Computer Engineering
Information Science
Statistics
Pure Mathematics
Data Science BSc in final approval stage
Data Science at ICMC-USP
Graduate level
Professional MSc in Data Science
MBA in Data Science
Business Intelligence and Analytics (with PBS, University of Porto)
Several PhD researchers in Data Science
In the last 6 years, 3 of the best PhD theses in Brazil (Ministry of Education) were in Data Science
All from ICMC-USP
Data Science at ICMC-USP
Research
Researchers from Applied Mathematics, Computer Science and Statistics
USP NAP Research Center in Machine Learning
Center on Mathematics, Statistics and Computer Sciences for Industry (CeMEAI)
1 of 17 Excellence Centers funded by FAPESP
Data Science is 1 of its 3 main areas
Members: 32 PIs, 73 Associated Researchers, 32 Postdocs and 226 PhDs
Books on Data Science
Responsible Data Science
CeMEAI
CEPID CeMEAI
Center for Mathematical Sciences Applied to Industry
Knowledge transfer to industries
CTA, UFSCar, UNICAMP, UNIFESP, USP
11-year project, started in 2013
Budget of US$ 15 million
Projects, Workshops with Companies, Professional MSc, Hackathons
Collaboration with companies
Agribusiness
Finance
Health
Education
Industry
Environment
Energy
Technology
Government of the State of São Paulo
Department of Finance (Secretaria da Fazenda)
CeMEAI Researchers
São José do Rio Preto
Ribeirão Preto
São Carlos
Bauru
Rosana
Botucatu
Itapeva
Campinas
São José dos Campos
São Paulo
Buri
Presidente Prudente
HPC - Cluster SGI-ICE-X
• 104 blades: 20 cores (Intel Xeon E5-2680v2), 128 GB RAM
• 1 Xeon Phi blade: 20 + 60 cores, 128 GB RAM
• Storage: 175 TB
• 40 blades: 28 cores (Intel Xeon E5-2680v4), 128 GB RAM
• 6 GPU nodes: Nvidia P100 GPU
• 4 fat nodes: 16 cores (E5-2680v2), 512 GB RAM
• Data server: 0.5 PB
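As a rough tally of the cluster capacity listed above, the sketch below sums CPU cores and RAM across the node groups. It assumes the core and RAM figures are per node (the slide does not say so explicitly), counts only the 20 host cores of the Xeon Phi blade, and leaves out the GPU nodes because their CPU and RAM configuration is not given. The names `NODE_GROUPS` and `cluster_totals` are illustrative only.

```python
# Node groups as listed on the slide:
# (label, node count, CPU cores per node, RAM in GB per node).
# Assumption: core and RAM figures are per node, not per cluster.
NODE_GROUPS = [
    ("Xeon E5-2680v2 blades", 104, 20, 128),
    ("Xeon Phi blade (host cores only)", 1, 20, 128),
    ("Xeon E5-2680v4 blades", 40, 28, 128),
    ("Fat nodes", 4, 16, 512),
]


def cluster_totals(groups):
    """Return (total CPU cores, total RAM in GB) over all node groups."""
    total_cores = sum(count * cores for _, count, cores, _ in groups)
    total_ram_gb = sum(count * ram for _, count, _, ram in groups)
    return total_cores, total_ram_gb


total_cores, total_ram_gb = cluster_totals(NODE_GROUPS)
print(f"CPU cores: {total_cores}, RAM: {total_ram_gb} GB")
# prints "CPU cores: 3284, RAM: 20608 GB"
```

Under these assumptions the CPU partition comes to 3,284 cores and roughly 20 TB of RAM, excluding the GPU nodes and the Xeon Phi coprocessor cores.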
Data Science areas at ICMC-USP
Complex systems
Data stream mining
AutoML
Machine learning
Data mining
Robotics
Bayesian Inference
Functional Data Modelling
Statistical Quality Control
Classification and Categorical Data Analysis
Survival Analysis
Time-varying big data visualization
Time series Data Science
Item Response Theory
Regression Models
AutoML tools
CreateML Apple
Amazon Rekognition
Pajé AutoML
End-to-end AutoML
Main focus
Data pre-processing
Explainable ML
Post-processing
Pipeline
Easily expandable
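To make the pipeline idea above concrete, here is a minimal sketch of an AutoML-style search in plain Python. It is not Pajé's interface, which the slides do not show; every name in it (`identity`, `standardize`, `knn_predict`, `evaluate`, `search`) and the synthetic data are assumptions made for illustration. It covers only two stages, data pre-processing and model configuration, and picks the combination with the best validation accuracy.

```python
import itertools
import random
import statistics


def identity(train, test):
    """Pre-processing option 1: leave the features unchanged."""
    return train, test


def standardize(train, test):
    """Pre-processing option 2: z-score each feature using training statistics only."""
    cols = list(zip(*train))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def evaluate(pre, k, train_x, train_y, val_x, val_y):
    """Validation accuracy of one pipeline: pre-processor followed by k-NN."""
    tx, vx = pre(train_x, val_x)
    hits = sum(knn_predict(tx, train_y, x, k) == y for x, y in zip(vx, val_y))
    return hits / len(val_y)


def search(train_x, train_y, val_x, val_y, preprocessors, ks):
    """Exhaustive search over (pre-processor, k); returns (accuracy, name, k)."""
    best = None
    for (name, pre), k in itertools.product(preprocessors.items(), ks):
        acc = evaluate(pre, k, train_x, train_y, val_x, val_y)
        if best is None or acc > best[0]:
            best = (acc, name, k)
    return best


# Synthetic two-class data: feature 0 is informative, feature 1 is large-scale noise.
rng = random.Random(0)
points = [([rng.gauss(i % 2, 0.5), rng.gauss(0, 100)], i % 2) for i in range(80)]
xs, ys = [p[0] for p in points], [p[1] for p in points]
train_x, train_y, val_x, val_y = xs[:60], ys[:60], xs[60:], ys[60:]

PREPROCESSORS = {"identity": identity, "standardize": standardize}
KS = [1, 3, 5]
best_acc, best_pre, best_k = search(train_x, train_y, val_x, val_y, PREPROCESSORS, KS)
print(best_pre, best_k, round(best_acc, 2))
```

Adding another pre-processing option or model setting only requires a new dictionary entry or list value, which is one simple reading of "easily expandable". Explainability and post-processing stages are outside the scope of this sketch.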
Pajé Time Arrow
Ricardo Sovat. Uma Abordagem Híbrida Baseada em Casos e Redes Neurais. Uma aplicação: escolha e configuração de modelos de redes neurais. [A Hybrid Approach Based on Cases and Neural Networks. An application: selection and configuration of neural network models.]
Claudia Regina Milaré. Extração de Conhecimento de Redes Neurais Artificiais. [Knowledge Extraction from Artificial Neural Networks.]
Bruno F. de Souza. Meta-aprendizagem aplicada à classificação de dados de expressão gênica. [Meta-learning applied to the classification of gene expression data.]
Rafael G. Mantovani. Use of meta-learning for hyperparameter tuning of classification problems.
Luis Paulo Garcia. Noise detection in classification problems.
Davi P. Santos. Seleção e controle do viés de aprendizado ativo. [Selection and control of active learning bias.]
André L. D. Rossi. Meta-aprendizado aplicado a fluxos contínuos de dados. [Meta-learning applied to data streams.]
Rodrigo C. Barros. Automatic design of decision tree induction algorithms.
Mariá C. V. Nascimento. Meta-heurísticas para o problema de agrupamento de dados em grafo. [Metaheuristics for the graph data clustering problem.]
Estéfane G. Lacerda. Model Selection of RBF Networks via Genetic Algorithms.
Pajé Team
Faculty: André C P L F de Carvalho, USP; André Rossi, UNESP; Bruno Campos Pimentel, UFAL; Bruno Feres de Souza, UFMA; Jefferson Oliva, UTFPR; Jorge Kanda, UFAM; Luis Paulo Faina Garcia, UNB; Rafael Mantovani, UTFPR
Collaborator: Carlos Soares, UP
Postdoctoral researchers: Kelly da Silva, Intel; Tiago Botari, FAPESP
Senior technician: Davi Pereira dos Santos, FAPESP
MSc students: Eric Rocha, CNPq; Tamires Brito, CNPq
PhD students: Adriano Rivoli da Silva, UTFPR; Douglas Castilho, IFPC; Edésio Alcobaça, FAPESP; Gean Trindade, CNPq; Jonas Kasmanas, FAPESP; Moisés Rocha, FAPESP; Saulo Mastelini, FAPESP; Tiago Cunha, FCT; Victor Barella, FAPESP; Victor Padilha, FAPESP
Undergraduate research: Felipe Siqueira, CNPq; Samuel Tomaz Bastos, CNPq; Matheus Sanchez, PRP-USP; Luan Icaro Pinto Arcanjo, PRP-USP; Rodrigo Martins Pires, PRP-USP; Thiago Musico, PRP-USP
Responsible Data Science
Accountability
Reproducibility
Privacy
Transparency
Explainable AI (XAI)
Fairness
Fair Information Practices
Questions?