Open SUNY COTE NOTE: A Virtual Infrastructure for Data Intensive Analysis (VIDIA)



DESCRIPTION

“COTE NOTE” is a companion resource for the monthly speaker series "Fellow Chat" of the Open SUNY Center for Online Teaching Excellence (COTE) Community of Practice. This publication is produced by Open SUNY COTE under the SUNY Office of the Provost to promote this event, feature our innovative online faculty, and to promote effective practices in online teaching and learning.

TRANSCRIPT

Page 1: Open SUNY COTE NOTE: A Virtual Infrastructure for Data intensive Analysis (VIDIA)

The Open SUNY Center for Online Teaching Excellence July 11, 2014 • Volume 1 • Issue 3

I would like to share what I know about data analysis

Providing undergraduates in SUNY access to high-performance computing for data analysis and visualization in all disciplines. Big data is sending ripples through all sectors of society. We track everything, and this trend is creating a critical need for skilled professionals who can mine and interpret the data.

The ability to collect, curate, analyze, and visualize large data sets is becoming critical in all disciplines. I would like to share how the University at Buffalo’s Center for Computational Research (CCR) and SUNY Oneonta faculty have collaborated to build an environment (VIDIA), accessible to all SUNY students, that allows teaching and research in the emerging field of “data science.”

What is it

VIDIA is hosted by the CCR and was made possible through a 2013 SUNY Innovative Instruction Technology Grant (IITG). The VIDIA site is powered by the HUBzero Platform for Scientific Collaboration, originally developed at Purdue University. HUBzero was specifically designed to help a scientific community share resources. Users can upload their own content, launch computations, and view results with an ordinary web browser, without having to download, compile, or install any code. The tools they access are not just web forms, but powerful graphical tools that support visualization and comparison of results.

How it works

HUBzero is an open-source software platform used to create web sites or “hubs” for scientific collaboration, research, and education. It has a unique combination of capabilities that support science and engineering. In addition to allowing access to hundreds of applications through a web browser, HUBzero technology is a little like YouTube.com in that it allows people to upload content and “publish” to a wide audience. Instead of being restricted to short video clips, it handles datasets, analysis tools, and other kinds of scientific content. In that respect, HUBzero is a little like MIT’s OpenCourseWare, but it also integrates the content with collaboration capabilities. A little like Google Groups, HUBzero lets people work together in a private space where they can share documents and send messages to one another. A little like Askville on Amazon.com, HUBzero lets people ask questions and post responses, but about scientific concepts instead of products.

What I did

Working with a team of faculty at Oneonta and staff at the CCR, I helped deploy this HUBzero environment with carefully selected tools and configurations so that undergraduates in social science courses (Sociology and Political Science) could complete assignments in social media analysis.

COTE NOTE

What I know about Data Analysis

Jim Greenberg

I am currently the Director of the Teaching, Learning, and Technology Center at SUNY Oneonta. I have worked at SUNY Oneonta for 33 years, helping to deploy technology in ways that improve teaching and learning (I hope). Along the way I have taught courses in Geographic Information Systems, Advanced Networking, various programming languages, and finally New Media. I’ve guest lectured and given workshops on numerous topics relating to technology over the years. I have served on committees at all levels, most recently EDUCAUSE’s EQ Editorial Committee and SUNY’s IITG Reviewer Committee. Personally, I am interested in how technology and culture interact, particularly in education. Some of the things I’ve been involved with over the years that I am most proud of are the establishment of the Teaching, Learning, and Technology Center on my campus and being in the room when COA and CIT were conceived.


“The harvesting and analysis of social media is an emerging tool in the social sciences. It has become increasingly important that SUNY students have the opportunity to become familiar with these emerging methodologies.”


Staff

The COTE Community Team:

Alexandra M. Pickett, Associate Director, SUNY Learning Network
Martie Dixon, Assistant Academic Dean, Distance Learning & Alternate Programs, Erie Community College
Patricia Aceves, Director of the Faculty Center in Teaching, Learning & Technology, Stony Brook University
Lisa Dubuc, Coordinator of Electronic Learning, Niagara County Community College
Christine Kroll, Assistant Dean for Online Education, Graduate School of Education, University at Buffalo
Deborah Spiro, Assistant Vice President for Distance Education, Nassau Community College
Lisa Raposo, Assistant Director and Academic Programs Manager, SUNY Center for Professional Development
Erin Maney, Senior Instructional Designer, Open SUNY

This publication is produced by the Open SUNY Center for Online Teaching Excellence under the SUNY Office of the Provost.

Contact/Questions

State University Plaza, Albany, New York 12246

[email protected]

How to Submit Material

This publication is produced in conjunction with the COTE “Fellow Chat” speaker series. Please submit a proposal at http://bit.ly/COTEproposal for consideration.

Visit http://commons.suny.edu/cote for more information.

To join COTE, visit http://bit.ly/joinCOTE

How I did it

Faculty in the social sciences identified desired student outcomes, and we used these as a guide to evaluate software tools. Faculty wanted their students to get a “movie trailer” of what it is like to be a data scientist in their discipline. In addition, they wanted students to be able to test theories that are discussed in class. Three software packages were evaluated: Orange, R, and Rapid Miner. All have sophisticated text and data analysis capabilities as well as visualization tools. Rapid Miner was chosen because of its ease of use and because it let us prepare processes in advance that undergraduates could use. R is being deployed this summer to expand the tool set for students and faculty.

Faculty at Oneonta, with the help of CCR staff, learned how to deploy resources, use Rapid Miner, and build processes and datasets for students. Instructions, example processes, and data sets were deployed for students.

Students in three courses at Oneonta created their own accounts, downloaded datasets that they had created using another tool (Trackur) that was acquired under this grant, and then ran Rapid Miner processes to analyze and visualize their data. Using the results, students prepared reports and presentations as part of their course work. These assignments were designed so that students could use data of their own interest and test theories they had learned about in class.
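To give a flavor of the kind of basic text analysis involved, here is a minimal sketch in Python. It is not the Rapid Miner process the students ran, and it does not reproduce the Trackur export format; the sample posts, the stop-word list, and the function names are all made up for illustration. It simply counts how often each term appears across a handful of posts.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (an assumption for this
# sketch); a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is",
              "are", "on", "for", "at"}


def tokenize(text):
    """Lowercase the text and return its word tokens, keeping hashtags whole."""
    return re.findall(r"#?[a-z0-9']+", text.lower())


def term_frequencies(posts, stop_words=STOP_WORDS):
    """Count how often each non-stop-word term appears across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(t for t in tokenize(post) if t not in stop_words)
    return counts


# Made-up posts standing in for a student's exported dataset.
sample_posts = [
    "Voting is open at the campus center #election",
    "Long lines for voting on campus today",
    "The #election results are in",
]

if __name__ == "__main__":
    # Print the most frequent terms in the sample posts.
    for term, count in term_frequencies(sample_posts).most_common(3):
        print(term, count)
```

A table of term counts like this is the sort of raw material a student could then chart or compare against a theory discussed in class.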

Why I did it

The harvesting and analysis of social media is an emerging tool in the social sciences. It has become increasingly important that SUNY students have the opportunity to become familiar with these emerging methodologies. This environment makes that possible.

What happened when I did it

A lot happened when I did this. For me, this project was a powerful example of the benefits to a four-year comprehensive college of collaborating with a university center. Friendships were formed. Expertise was put to good use. The project demonstrated that a university center can build a sustainable environment that diverts almost no IT resources from the local campus. Best of all, we were able to create an accessible, sustainable data analysis and visualization environment that any SUNY campus can use and, most importantly for SUNY Oneonta, one that is accessible to undergraduates.

What I learned

I learned that open source can and does work, and that it is possible to build out an environment on HUBzero technology that can replace much of what local academic computing staff try to deploy and support. If nothing else, I learned that SUNY should try to incentivize campuses to take advantage of university center resources. Our universities are capable of deploying IT environments that other SUNY campuses cannot, and we can take advantage of this.

How others can use it

If you want to try out this environment, go to http://vidia.ccr.buffalo.edu and register for an account. Once you have an account, follow the instructions at https://vidia.ccr.buffalo.edu/resources/42/download/icebreaker-RM-instructions-v3.txt. These will lead you through a basic text analysis using prepared data and processes in Rapid Miner. You can also look through the resources posted in the environment or contact me at [email protected], and I will be happy to get you started.

This publication is disseminated under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license.