ncds annual report 2014 v3

Annual Report 2014

Upload: ncds-reading-repository

Post on 23-Jul-2016


Annual Report 2014


NCDS: It’s about access and collaboration

Stanley C. Ahalt, PhD

Chair, NCDS Steering Committee

Director, Renaissance Computing Institute (RENCI)

Professor of Computer Science, UNC Chapel Hill

NCDS Members

Corporate:

Deloitte Consulting, LLP

General Electric

IBM

Cisco Systems, Inc.

SAS Institute, Inc.

Academic:

Drexel University

Duke University

North Carolina State University

UNC Chapel Hill

UNC General Administration

UNC Greensboro

UNC Charlotte

Texas A&M University

Nonprofit:

MCNC

RTI International

Government:

National Institute of Environmental Health Sciences

U.S. Environmental Protection Agency

National Consortium for Data Science

100 Europa Drive, Suite 540 Chapel Hill, NC, 27517

919-445-9640 www.data2discovery.org [email protected]

@thencds

The last year has been a busy one for the National Consortium for Data Science. As a new organization dedicated to bringing together data specialists in business, academia, government and the nonprofit sector, we operate in a space that is new, different, exciting, and sometimes challenging.

While other groups interested in seizing the opportunity of big data have formed across the country, none is as focused on bringing industry leaders and researchers together to address these issues. The NCDS believes that industry leaders who have access to world-class researchers in data science benefit by getting a glimpse of cutting-edge data research, understanding research challenges, and helping researchers focus their work in ways that lead to business innovation and new opportunities. We also believe researchers are enriched by access to people who use and create data in government and in the private sector: people who can help them understand the impact of their work on practical problems that affect efficiency, productivity, and competitiveness.

For those reasons, NCDS events in 2014 focused on bringing together people with different backgrounds and perspectives. Our Data Innovation Showcase gave members and promising students the chance to share their most interesting projects and ideas. Our Data Science Faculty Fellows were selected by a committee of mostly industry members in an effort to align Fellows’ research projects with industry interests.

Of course data science and innovation can only take place if we have talented data specialists to feed a hungry job market and if professionals in all sectors have the skills to succeed in a data-rich world. That’s why education was an important component of NCDS activities in 2014. From our data careers panel discussion, held in collaboration with UNC Career Services, to the Data Matters Short Course Series that attracted more than 110 working professionals, the NCDS is committed to programs that will help our nation meet the big data challenge head on.

This report contains an overview of our activities during the last year. My sincere appreciation goes out to all our members and supporters who made these events successful. As the data deluge continues to grow and present new challenges and opportunities, the NCDS is also growing and building momentum. I invite you to join us on this unique journey. I guarantee it will be interesting.


RESEARCH

NCDS awards fellowships to faculty researchers advancing data science

The NCDS named five faculty members at North Carolina universities as its inaugural Data Science Faculty Fellows. The Faculty Fellows each received $30,000 to support research projects that address novel and innovative data science issues. The program aims to enable research, fund prototype development, and facilitate activities that support the NCDS vision of unleashing the power of big data by developing and mastering data science. The Fellows program also seeks to foster relationships between university researchers and NCDS members, to bridge gaps between research and practice, to promote innovative approaches to data science challenges, and to engage the next generation of data scientists.

Twenty faculty members from seven institutions submitted proposals for the Fellowships, which were reviewed by a committee of NCDS members and supporters. The program was the first official NCDS effort to support scientists involved in research that shows promise for advancing data science. The NCDS expects to continue the program in 2015. Congratulations to the 2014 Faculty Fellow awardees:

• Rajeev Agrawal, PhD, assistant professor, department of electronics, computer and information technology, North Carolina A & T State University. Designing Sustainable and Domain Neutral Next Generation Data Infrastructure to Advance Big Data Science.

• Jane Greenberg, PhD, professor, School of Information and Library Science, UNC-Chapel Hill, and Director, Metadata Research Center. The Metadata Capital Initiative.

• Blair Sullivan, PhD, assistant professor, department of computer science, North Carolina State University. Tracking Community Evolution in Dynamic Graph Data Using Tree-like Structure.

• Wlodek Zadrozny, PhD, associate professor, College of Computing and Informatics, UNC-Charlotte. Searchable Repository of Resilience and Sustainability Technologies.

• Justin Zahn, PhD, department of computer science, North Carolina A & T State University. COMDET: A Novel Community Detection System for Large Networks.

In addition to furthering the NCDS vision, Data Fellows are expected over time to generate measurable deliverables such as new methods, models, applications, or prototypes that can be used to develop larger efforts supported with extramural funding.

For more information, see http://data2discovery.org/data-fellows/.

Pictured: Rajeev Agrawal, Jane Greenberg, Blair Sullivan, Wlodek Zadrozny, and Justin Zahn.

Member working groups to launch in late 2014

NCDS member organizations will have the chance to collaborate on data challenges directly related to their organizational objectives through working groups that span disciplines, business sectors, and the public and private sectors beginning in late 2014.

Working groups will give members opportunities to address data issues from a variety of perspectives and to develop outcomes that impact members, such as white papers, position papers, best practice documents, lectures, panel discussions or special events.

Although details about structure and activities are still being finalized, tentative plans call for small groups of members, and possibly nonmember experts, to meet several times over the course of a year. The groups will give members a mechanism for interacting with data experts who are not part of the NCDS and possibly with data science students.

More information about working groups will be presented at the NCDS Fall teleconference and in future newsletters.


EDUCATION

First student-industry networking event focuses on data careers

Data-focused businesses are always on the hunt for bright young talent, and students studying curricula related to data science are eager to know how their educations can translate into rewarding careers. As an organization that spans industry and academia, the NCDS has the ability to connect these job providers with job-seeking students. The first NCDS student networking event, held in collaboration with the UNC-Chapel Hill Career Services Office, brought more than 100 students with a wide range of backgrounds and interests to UNC’s Hanes Hall on the evening of April 7 to discuss career opportunities in big data and informatics. The students learned from industry experts about career opportunities in data science and the skill sets that employers look for. The industry representatives also offered advice to students on furthering their education and pursuing internships and job opportunities.

A Data Science Industry Panel from NCDS member institutions kicked off the event. Panelists talked about the data science expertise needed at their organizations and provided advice for landing jobs in data science fields. A Q and A session followed. Panelists included Pat Herbert, SAS; Dianne Fodell, IBM; Monique Morrow, Cisco; and Craig Hill, RTI.

Above: Students listen to NCDS panelists at the UNC career event. Left: IBM’s Dianne Fodell talks to students at the event’s networking session.

The NCDS plans to offer the event annually, possibly at different NCDS member campuses.

Fall UNC Computer Science Tech Talks to Feature NCDS speakers

This fall, the UNC computer science department and University Career Services invite NCDS corporate representatives to take part in NCDS Tech Talks. The talks will allow industry members to share information about career and internship opportunities at their organizations and educate the UNC community about their business objectives and corporate culture.

Tech Talk speakers will offer a general overview of their company, positions currently open with the company, and skills and characteristics they look for in employees. Additionally, they will share current research questions and challenges and discuss the qualities they look for in employees to help address these issues.

Tech Talks provide an opportunity for NCDS members and students to network in a relaxed and convenient environment, discuss current data science interests and trends, and begin the relationship-building process that leads to gainful and fruitful employment.

Featured speakers:

• October 9: Karen Davis, Vice President, Research Computing Division, RTI International

• October 23: Russ Gyurek, Director of Innovation, Office of the CTO, Cisco

Both talks will begin at 5:30 p.m. and last until 7:30 p.m., including time for networking between students and speakers. Both will take place in Sitterson Hall on the UNC Chapel Hill campus.


NCDS co-hosts Data Matters Summer Workshop Series

The NCDS co-sponsored a summer workshop series June 23 - 27 with RENCI and UNC-Chapel Hill’s Odum Institute. The Data Matters Summer Workshop Series was aimed at business leaders, academic researchers and government officials who could benefit by better understanding how to manage, use, share and store big data. The courses targeted people interested in learning how to leverage the so-called “data deluge” to their benefit, those looking to understand how data can be used in their work, and those interested in specific software and data challenges.

The week involved two-day courses on Monday/Tuesday and Thursday/Friday and one-day courses on Wednesday. Classes were conducted at the William and Ida Friday Center for Continuing Education in Chapel Hill, and most also included hands-on lab sessions on the UNC-Chapel Hill campus. Instructors included experts from the Odum Institute, RENCI, University of Massachusetts at Amherst, Saffron Technology, Pennsylvania State University, Duke University and Cisco.

Topics covered during the packed week included data science and its goals, techniques and concepts; strategies for managing big data; social network analysis; data management tools such as Hadoop and SAS; using large-scale data networks; data mining and machine learning; and data visualization.

More than 110 students attended, representing 25 different organizations, the majority of them universities.

The week also included a kick-off reception Monday night at Top of the Hill restaurant, where attendees, instructors and invited guests had the opportunity to network more informally with their classmates and experts in the field.

At the reception, UNC-Chapel Hill Executive Vice Chancellor and Provost James Dean welcomed participants and spoke about the value of data. In his words, data is the currency of the 21st century, and those who learn how to analyze, manage, share and glean knowledge from it will be the leaders of the 21st century. Dean ended by thanking the Data Matters sponsors and participants.

A second Data Matters short course series is planned for June 2015. For additional information, visit: http://data2discovery.org/events/10/data-matters-summer-workshop-series/.

UNC Provost James Dean


NCDS provides support to data conferences

As part of its effort to advance data science, the NCDS partners with organizations planning conferences of interest to the field. In May, the NCDS was a gold-level sponsor of the second international conference on big data science and computing. Sponsored by the Academy of Science and Engineering, the conference, called BigDataScience, was held at Stanford University in Palo Alto, CA. Justin Zahn, an NCDS Faculty Fellow, served as the conference steering chair, and Stan Ahalt, head of the NCDS Steering Committee and director of RENCI, presented a talk on the NCDS and the importance of advancing data science. For more on the conference, visit the BigDataScience website.

In March 2015, the NCDS will participate in the Data4Decisions conference and exposition, a new national trade show that organizers hope to hold annually at the Raleigh Convention Center. Ahalt is a member of the conference planning committee and the NCDS plans to participate as an exhibitor and sponsor. For more information, visit the Data4Decisions website at http://data4decisions2015.com/.

INFRASTRUCTURE

Data Innovation Showcase features presentations, student posters

The NCDS Data Innovation Showcase brought together NCDS members, NCDS Data Science Faculty Fellows, and talented data science students to share ongoing and new innovative data-related projects, activities and ideas. The event was held May 21 at RENCI headquarters in Chapel Hill.

The Showcase included three components. First, NCDS Faculty Fellows delivered short presentations about their NCDS-supported projects. Later in the day, NCDS members presented on a wide range of topics, including ongoing data science research, development of new products and services, case studies, and more. Presentations by NCDS members included:

• Judith Cone, special assistant to the Chancellor for Innovation & Entrepreneurship, UNC-Chapel Hill. Developing Data-literate Students.

• Bill Wheaton, director of the Geospatial Science and Technology program, RTI International. Concepts and Applications of Large-Scale Synthetic Human Populations.

• Stanislav Minsker, PhD, visiting assistant professor, Duke. Geometric Median-based Approach to Robust and Scalable Statistical Estimation.

• Russ Gyurek, director, Innovation Labs-CTO Group, Cisco. IoT.

• Pat Herbert, principal systems architect for big data, SAS International. From Interesting to Actionable…Data Science Yields Functional Results.

• Steve Gustafson, Knowledge Discovery Lab manager, GE Global Research. Big Industrial Data.

• Claire McPherson, SAS Global Alliances, Deloitte LLP. Deloitte Analytics - Embedding Analytics in Everything We Do.

Students from member institutions presented posters on topics related to the NCDS mission during the Student Poster Session. Posters were reviewed by an NCDS committee, and 14 posters were on display all day. Five students received Best Student Poster awards:

• Michael O’Brien, PhD (May 2017), Computer Science, NC State.

• Angela Murillo, PhD candidate (May 2015), Information & Library Science, UNC-Chapel Hill.

• Rebecca Lee, BS-Biology (May 2014), UNC-Chapel Hill.

• Muhammad Suleiman, MS-Information Technology (May 2014), NC A&T State.

• Kristin Garrett, PhD candidate (May 2016), Political Science, UNC-Chapel Hill.

All winning posters received monetary awards and all student participants received certificates of appreciation. The NCDS plans to replicate this event in 2015.

For more, see http://data2discovery.org/innovation-showcase/.

Above: Student Poster Session participants


Data Observatory launches with Dataverse Network and data sets

The NCDS Data Observatory seeks to create a diverse repository of very large data sets for NCDS members to use and share in support of the mission of advancing data science. It will provide a place for those interested in the science of data to form a community to exchange tools, approaches, data and other relevant information.

Last fall the implementation team, including a graduate student from UNC Chapel Hill’s computer science department, set up the observatory’s computing environment and completed test installs of iRODS (the integrated Rule-Oriented Data System) and an instance of a Dataverse Network, a container for research data studies, customized and managed by its owner.

This year, the team created an NCDS-branded Dataverse Network (http://observatory.data2discovery.org/dvn/) and uploaded the first two data sets to the network: a North Carolina Digital Elevation Model, and storm surge and wind wave data from the ADCIRC modeling system. They will also document how users create their own Dataverse within the NCDS network and address issues such as account requests and creating use agreements.

As part of an effort to link development of the data observatory with data science education, RENCI Senior Research Software Developer Erik Scott and a graduate student supported big data courses at NC State and UNC-Chapel Hill during the 2013-2014 academic year.

Their work focused on educating students on Hadoop, a framework for massively parallel data storage and analysis. Scott set up a machine environment and accounts for both classes, presented lectures, and collaborated on creating homework assignments and grading.

Left: RENCI’s Erik Scott (left) shows UNC computer science students how to work with a data supercomputer designed for data-intensive computing using Hadoop.

The courses were: Statistics 810, Big Data: A Statistical Perspective, taught by Lexin Li, associate professor in the NC State department of statistics; and Information and Library Science 690 - 163, Introduction to Big Data and NoSQL, taught by Arcot Rajasekar, professor in the UNC School of Information and Library Science.

NCDS Mission

To advance data science to better enable the U.S. to utilize big data in ways that result in new jobs and industries, advances in healthcare, and transformative discoveries in science.

NCDS Goals

• Engage a broad community of data science experts in business, academia and government.

• Facilitate interaction among data specialists across disciplines and business sectors so that data challenges can be addressed strategically and holistically.

• Support the development of educational programs that will train a new generation of data scientists and develop a data-literate workforce.

• Encourage the development of technical, ethical and policy standards for data.

Stay in touch with the NCDS

Our electronic newsletter, Data Matters, is published quarterly and more often if needed. Anyone can sign up to receive the newsletter from the homepage of the NCDS website (data2discovery.org).