F18 Presentation Book (original)

MASTER’S PRESENTATIONS FALL 2018 Thursday, December 13, 2018 9:00 am – 11:00 am Room KC 2204


MASTER’S PRESENTATIONS

FALL 2018

Thursday, December 13, 2018

9:00 am – 11:00 am Room KC 2204


SCHOOL OF CIS

FALL 2018

MASTERS PRESENTATIONS

Thursday, December 13, 2018

Schedule of Presentations

KC 2204: 9:00 am - Three to Five Minute Lightning Rounds

Daniel Lindeman – MS Project, Advisor: Dr. D. Robert Adams
“Puzzle Level Generation with Answer Set Programming”

Katherine Skocelas – MS Project, Advisors: Dr. D. Robert Adams & Dr. Byron DeVries
“Systemic Lupus Erythematosus Symptom Severity Prediction Using a Recursive Neural Network”

Ryan Solnik – MS Project, Advisor: Dr. D. Robert Adams
“Story Parsing and Adventure Generation with Python and Postgres”

Dipana Sorathiya – MS Project, Advisor: Dr. D. Robert Adams
“Graphical Log Analyzer”

Evelyn Edwards – MS Project, Advisor: Dr. Andrew Kalafut
“The Insecurity of Things (IoT)”

Achyutarama Ganti – MS Project, Advisor: Dr. Jared Moore
“Stock Market Analysis Using Machine Learning Algorithms”

Debaditya Gautam – MS Project, Advisor: Dr. Christian Trefftz
“A Parallel Algorithm to Calculate an Approximation to the Order-K Voronoi Diagram”

Brett VanderHaar – MS Project, Advisor: Dr. Jonathan Leidig
“A Framework for Discovering Latent Insights in Clinical Data”

Nicolás Arias González – MS Project, Advisor: Dr. Jonathan Engelsma
“Web-Based, Deep Learning Assisted Medical Image Tagging Tool”

David Dick – MS Project, Advisor: Dr. Jonathan Engelsma
“Development of a Mobile Friendly Self-Service Experience at Grand Rapids Community College”

Joseph McCartney – DSA Internship, Internship Supervisor: Mr. Aaron Kamphuis
“Open Systems Technologies Data Analyst Internship: AWS Recommendation System”

Samantha Milano – DSA Internship, Internship Supervisor: Mr. Josh Schwannecke
“Amway Data Science Internship: An Analysis of the Sky Air Filtration System and Indoor Air Quality”

Sixty-minute poster presentations to immediately follow.


Puzzle Level Generation with Answer Set Programming

Masters Project

Presented By: Daniel Lindeman

Advisor: Dr. Robert Adams

Abstract: Swappy is a puzzle game that requires different character tokens to cooperatively navigate a maze to reach their goals. Swappy characters are special in that whenever they are collinear with another character, they may swap places. In practice, generating levels manually may take upwards of 20 hours and is error-prone. By employing Answer Set Programming (ASP), it is possible to generate and constrain level creation such that levels are solvable, meet an aesthetic standard, and follow the rules of the game. With the grounder/solver tool Clingo, level creation can be done in a matter of seconds or minutes. The expressive power of rules and constraints allows the developer to more clearly see their game for the abstract ruleset that it is. In this project we explore the use of ASP Prolog to generate artifacts useful for level generation for the puzzle game Swappy, finding succinct and expressive ways to do so compared to traditional programming languages.
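
The solvability constraint described above can be illustrated outside ASP with a brute-force check. The sketch below is plain Python, not Clingo, and is not the project's encoding. The grid format, the move set (one step in four directions, or a swap between two tokens that share a row or column), and the names `collinear` and `solvable` are all assumptions made for illustration. In particular, the abstract does not say whether walls block a swap, so this sketch ignores walls when swapping.

```python
from collections import deque


def collinear(a, b):
    """Assumed rule: two cells are collinear if they share a row or a column."""
    return a[0] == b[0] or a[1] == b[1]


def solvable(grid, starts, goals):
    """Breadth-first search over the joint positions of all tokens.

    grid   -- list of equal-length strings; '#' is a wall, anything else is floor
    starts -- tuple of (row, col) positions, one per token
    goals  -- tuple of (row, col) targets, in the same token order
    """
    rows, cols = len(grid), len(grid[0])

    def free(cell):
        r, c = cell
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] != "#"

    seen = {starts}
    queue = deque([starts])
    while queue:
        state = queue.popleft()
        if state == goals:
            return True
        successors = []
        for i, (r, c) in enumerate(state):
            # Single-step moves into free cells not occupied by another token.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                cell = (r + dr, c + dc)
                if free(cell) and cell not in state:
                    successors.append(state[:i] + (cell,) + state[i + 1:])
            # Swaps with any other token that is collinear with this one.
            for j in range(i + 1, len(state)):
                if collinear(state[i], state[j]):
                    swapped = list(state)
                    swapped[i], swapped[j] = swapped[j], swapped[i]
                    successors.append(tuple(swapped))
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

In a one-cell-wide corridor such as `["..."]`, two tokens that must trade ends cannot pass each other by walking, but `solvable(["..."], ((0, 0), (0, 2)), ((0, 2), (0, 0)))` succeeds because the swap rule applies. A generator would keep only candidate levels for which a check of this kind passes.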


Systemic Lupus Erythematosus Symptom Severity Prediction Using a Recursive Neural Network

Masters Project

Presented By: Katherine G. Skocelas

Advisors: Dr. Robert Adams & Dr. Byron DeVries

Abstract:

Systemic Lupus Erythematosus (SLE) is a chronic autoimmune disease that causes the immune system to attack the body’s own connective tissues and organs. Humans have difficulty predicting SLE symptom severity levels because of the complex interactions of disease trigger exposure levels over time. To address this issue, we constructed a novel machine learning solution that generates a model capable of predicting SLE symptom severity levels with 8.3-19.9% average error. It does so by feeding trigger exposure levels into a recursive neural network (RNN) and training the network with a unique method that continually turns training on and off based on the maximum error each day. This allows the RNN to learn SLE flare activity without overtraining on remission activity, thus maintaining a greater degree of plasticity. Models trained in this fashion performed 3.5-5% better on average than those trained via the standard method. Future areas of work include replicating these results with a large patient training data set and evolving the model to predict disease trajectory.
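
The idea of switching training on and off by each day's maximum error can be sketched with a toy model. The code below is not the project's network: it swaps in a one-variable linear predictor for the RNN, and the threshold, learning rate, data layout, and the name `train_gated` are hypothetical choices made only to show the gating step.

```python
def train_gated(days, threshold=0.1, lr=0.1, epochs=1000):
    """Fit y = w * x + b, skipping any day whose worst error is already small.

    days -- list of days; each day is a list of (x, y) samples
            (a hypothetical layout, e.g. exposure level and severity)
    Returns (w, b, updates), where updates counts the days that were trained on.
    """
    w, b = 0.0, 0.0
    updates = 0
    for _ in range(epochs):
        for day in days:
            errors = [(w * x + b) - y for x, y in day]
            # Gate: training is "off" for a day the model already predicts well.
            if max(abs(e) for e in errors) <= threshold:
                continue
            updates += 1
            # One gradient step on this day's squared error.
            for (x, _), e in zip(day, errors):
                w -= lr * e * x
                b -= lr * e
    return w, b, updates
```

Days that are already predicted within the threshold (remission-like days in the abstract's terms) contribute no updates, so the parameters keep moving only in response to days with a large error.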


Story Parsing and Adventure Generation with Python and Postgres

Masters Project

Presented By: Ryan Solnik

Advisor: Dr. Robert Adams

Abstract: Dungeons and Dragons is a tabletop roleplaying game that allows players to assume the roles of adventurers in a medieval fantasy setting while one player is tasked with playing the role of the Dungeon Master (DM). This player facilitates the story and all other characters not played by the other players. Adventure Day is a toolset for Dungeons and Dragons 5th Edition that assists the Dungeon Master in formatting their story as well as gathering useful details for the challenges presented within that adventure. Adventure Day aims to accomplish this by associating relevant monster data from a Postgres database with the text input of a desired adventure, which is parsed through a Python application to filter for highlighted terms and take action based on a defined dictionary within the same text file. Adventure Day will be capable of building encounters with monsters specified by monster environment and/or abilities that they possess, providing inspiration for non-player characters that the DM will portray via personality traits, and helping to scale monster difficulty depending on the power level of the players’ characters. Adventure Day will work in tandem with a helper site that formats Markdown text into a format that mirrors that of published material.
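
A minimal sketch of the parse-then-filter flow described above follows. It is an illustration under stated assumptions, not Adventure Day's code: the `[[key:value]]` highlight markup, the in-memory `MONSTERS` list standing in for the Postgres table, the field names, and the rule that a monster fits when its challenge rating does not exceed the party level are all hypothetical.

```python
import re

# Hypothetical stand-in for rows that would come from the Postgres monster table.
MONSTERS = [
    {"name": "Goblin", "environment": "forest", "cr": 0.25},
    {"name": "Owlbear", "environment": "forest", "cr": 3},
    {"name": "Ghoul", "environment": "swamp", "cr": 1},
]

# Assumed highlight markup inside the adventure text: [[key:value]]
HIGHLIGHT = re.compile(r"\[\[(\w+):([\w ]+)\]\]")


def parse_highlights(text):
    """Return (key, value) pairs for every highlighted term in the text."""
    return [(key, value.strip()) for key, value in HIGHLIGHT.findall(text)]


def build_encounter(text, party_level):
    """Pick monsters whose environment is highlighted and whose challenge
    rating (cr) does not exceed the party level (an assumed scaling rule)."""
    environments = {
        value.lower()
        for key, value in parse_highlights(text)
        if key == "environment"
    }
    return [
        m["name"]
        for m in MONSTERS
        if m["environment"] in environments and m["cr"] <= party_level
    ]
```

For the text `"The party enters the [[environment:Forest]] at dusk."`, a level 1 party gets only the Goblin, while a level 3 party also gets the Owlbear.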


Graphical Log Analyzer

Masters Project

Presented By: Dipana Sorathiya

Advisor: Dr. Robert Adams

Abstract: In an automotive embedded system, Inter-Process Communication (IPC) messages are exchanged between various software modules. These messages (or signals) are exchanged at high frequencies and recorded in a text-based log file. Unfortunately, this format is generally tedious to read and difficult to analyze. Graphical Log Analyzer is a tool that parses the debug logs (output) from an automotive embedded system and converts them into a graphical representation so that they are easier to understand for system debugging purposes. Graphical Log Analyzer parses the software module names and message names, and identifies the direction of message transmission and the end point of reception. It then generates a “.diag” file that is processed by the seqdiag tool to produce a sequence diagram. This visual representation of software module exchanges allows developers to more easily debug complex embedded software issues.
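
The log-to-diagram step can be sketched as follows. The abstract does not show the log format, so the line layout assumed here (`<timestamp> <sender> -> <receiver> : <message>`), the module names in the usage note, and the function name `log_to_diag` are hypothetical. The output uses seqdiag's edge syntax, `A -> B [label = "..."];`, inside a `seqdiag { }` block.

```python
import re

# Assumed log line layout: "<timestamp> <sender> -> <receiver> : <message>"
LINE = re.compile(
    r"^\S+\s+(?P<src>\w+)\s*->\s*(?P<dst>\w+)\s*:\s*(?P<msg>\w+)\s*$"
)


def log_to_diag(log_text):
    """Convert matching log lines into the text of a seqdiag .diag file.

    Lines that do not match the assumed layout are skipped.
    """
    edges = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            edges.append(
                '  {src} -> {dst} [label = "{msg}"];'.format(**match.groupdict())
            )
    return "seqdiag {\n" + "\n".join(edges) + "\n}\n"
```

Writing the returned string to a file such as `trace.diag` and running the seqdiag tool on it would then render the message exchange as a sequence diagram.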


The Insecurity of Things (IoT)

Masters Project

Presented By: Evelyn Edwards

Advisor: Dr. Andrew Kalafut

Abstract:

Convenience is important to everyone. In our fast-paced society, people are willing to pay for devices that can save them time, even if it is just a few minutes. Over the past few years, Internet of Things (IoT) devices, or smart devices, have become a popular way for people to leverage technology in order to save time. These devices can be used in every area of a home, including the entryways, the kitchen, and the living room.

While all of these devices make daily life more convenient, their lack of security makes hackers’ lives more convenient, too. The majority of IoT devices lack basic security features, and most consumers install the devices in their homes with the default settings. This provides cyber criminals with the means to hack into a system with minimal time and effort.

I focused on the security of a popular smart device, the smart light bulb. I compared the security features of two different smart light bulbs by running a series of penetration tests against them. The main aspects of the light bulbs that were tested include the phone application that controlled the light bulb and the Bluetooth protocol that the phone application used to communicate with the light bulb. These tests show that the lack of security in common IoT devices is a serious problem that cyber criminals could take advantage of.


Stock Market Analysis using Machine Learning Algorithms

Masters Project

Presented By: Achyut Ganti

Advisor: Dr. Jared Moore

Abstract: The stock market moves a large amount of wealth between individuals and institutions daily. Forty million transactions, involving 10 billion shares, are exchanged in the US market alone every day. In the past twenty years, computers have dominated transaction volume, processing information at a rate inconceivable to human traders. Machine learning (ML) algorithms have gained traction for their ability to digest data and formulate predictions in many domains. This project investigates three ML algorithms as applied to publicly available stock market data to predict future prices. The goal of this project was to observe how these algorithms perform on a dataset consisting of historical prices for one company and to compare their accuracy against each other. The closing price for each day was selected as the target variable. The algorithms predicted the closing price for the next thirty days based on the input data. A “good” algorithm would predict closing prices close to the actual values. Results show that a Random Forest outperforms Neural Networks and Linear Regression for our dataset. Although the results were initially promising, I noticed during the project that the models did not perform as well as would be expected from ML. The dataset had been gathered from freely available sources on the Internet, and conversations with veteran stock traders suggest that publicly available data is not enough to produce an effective stock trading model. Going forward, I would need to acquire data from paid sources, because such data includes features such as climate models, public sentiment analysis, and forward-looking business climate information. Still, the project was successful in applying ML to stock market price prediction, although it is not a fully functioning trading algorithm.
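
The evaluation described above (predict the next closing prices, then measure how close they are to the actual values) can be sketched with the simplest of the three named algorithms, linear regression. This is an illustration under stated assumptions, not the project's code: it regresses the closing price on the day index only, uses made-up prices, and the names `fit_line`, `forecast`, and `mean_absolute_error` are hypothetical.

```python
def fit_line(closes):
    """Ordinary least squares of closing price against the day index 0..n-1."""
    n = len(closes)
    mean_x = (n - 1) / 2
    mean_y = sum(closes) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(closes))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(closes, horizon=30):
    """Extrapolate the fitted line for the next `horizon` days."""
    slope, intercept = fit_line(closes)
    n = len(closes)
    return [intercept + slope * (n + i) for i in range(horizon)]


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual closing prices."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Computing `mean_absolute_error` for each model on the same held-out thirty days gives one number per algorithm, which is one straightforward way to rank them against each other.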


A Parallel Algorithm to Calculate an Approximation to the Order-K Voronoi Diagram

Masters Project

Presented By: Debaditya Gautam

Advisor: Christian Trefftz

Abstract: In this project, a parallel algorithm has been developed to find an approximate solution to the Order-K Voronoi Diagram. The results obtained from an implementation using Java streams are elaborated in the following sections. The implementation and experiments are based on a finite number of points. The algorithm formulated in this project has been developed on the basis of an implementation on Graphics Processing Units.

Figure 1: Sample Voronoi Diagram

Figure 2: Order-K Voronoi Diagram, k=2
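
One common way to approximate an order-k Voronoi diagram is to sample the plane on a grid and label each cell with the set of its k nearest sites; cells sharing a label belong to the same region. The sketch below takes that approach in Python. It is not the author's algorithm (which the abstract describes as using Java streams and GPUs); the grid sampling, the tie-breaking by site index, and the names `k_nearest` and `order_k_voronoi` are assumptions. Each row is computed independently, which is the property a parallel version can exploit. The thread pool here only illustrates that decomposition and should not be expected to give a real speedup in CPython.

```python
from concurrent.futures import ThreadPoolExecutor


def k_nearest(point, sites, k):
    """Return the indices of the k sites closest to point, as a frozenset.

    Ties in distance are broken by the lower site index (an assumed rule).
    """
    px, py = point
    ranked = sorted(
        range(len(sites)),
        key=lambda i: ((sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2, i),
    )
    return frozenset(ranked[:k])


def order_k_voronoi(sites, k, width, height):
    """Label every grid cell (sampled at its center) with its k nearest sites.

    Returns a list of rows, indexed as grid[y][x].
    """

    def label_row(y):
        return [k_nearest((x + 0.5, y + 0.5), sites, k) for x in range(width)]

    # Rows do not depend on each other, so they can be mapped in parallel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(label_row, range(height)))
```

With `k=1` the labels reduce to the ordinary Voronoi diagram, and a finer grid gives a closer approximation of the region boundaries.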


A Framework for Discovering Latent Insights in Clinical Data

Masters Project

Presented By: Brett VanderHaar

Advisor: Dr. Jonathan Leidig

Abstract: Amazon and Starbucks, two of America's most beloved brands, use data analysis every day to help understand their consumers. Mature data analysis frameworks allow organizations to uncover latent insights about their consumers' behavior. These same data analysis techniques are not as common in the healthcare industry. With healthcare growing at a substantial rate, it is paramount that healthcare organizations begin utilizing data analysis to improve patient care. By partnering with a local healthcare organization, it was possible to use exploratory data analysis to find key areas where business and clinical leaders could answer complex questions. The research used Python to combine data from multiple sources, analyze the data, and prepare it for further analysis. R was used to explore, extract, and create visual representations of the data. Conclusions varied in significance and nature, with some findings creating an appetite for further utilization of data analysis, some being suggestive but not conclusive and requiring further exploration, and some ideas producing no concrete findings due to data and timing constraints. Due to the confidential and proprietary requirements of the organization partnership, the research expands on broad ways data analysis can be used to improve patient care.


Web-Based, Deep Learning Assisted Medical Image Tagging Tool

Masters Project

Presented By: Nicolás Arias González

Advisor: Jonathan Engelsma

Abstract: One of the biggest challenges when building supervised machine learning models is to obtain the desired dataset along with its respective annotations. This is especially true in the medical field, where all data produced is expected to be consumed by a human being instead of a machine. More often than not, the data is available only on its own, without annotations, and data scientists are burdened with manually creating the tags for it, a tedious and time-consuming task. This project aims to speed up the process of manually annotating regions of interest (ROI) in images from computed tomography (CT) scans by leveraging fully convolutional deep networks and web technologies. A partially trained deep learning model suggests ROIs to the user, who evaluates and adjusts them. These corrected images can then be fed to the model as ground truths to continue training. The end result of this process is the tagged dataset and a fully trained machine learning model for predicting ROIs in CT scans. In an experiment performed with the help of a medically trained volunteer, tagging images aided by a model trained with 2.3% of the dataset resulted in a 7x speedup over the manual process.


Development of a Mobile Friendly Self-Service Experience at Grand Rapids Community College

Masters Project

Presented By: David Dick

Advisor: Jonathan Engelsma

Abstract: Computer use reflects the development of technology from powerful desktop computers, to portable laptops, to small handheld smartphones. Users today want the capability to perform tasks from anywhere at any time with any device. In order to meet these demands and stay relevant, organizations must adopt and implement updated technologies. This project focuses on a need to adjust to the technological shift at Grand Rapids Community College. The college’s self-service system, originally developed in the early 2000s, no longer met the needs of the campus community. Mobile phone users in particular were unsatisfied with the experience. To solve the problems with the current system, this project leveraged web development tools and new programming capabilities, recently added to PeopleSoft, to create a responsive experience for all users regardless of the device. The new interface was successfully deployed on November 12, 2018, and continues to serve the college community.


Open Systems Technologies Data Analyst Internship: AWS Recommendation System

DSA Internship

Presented By: Joe McCartney

Advisor: Aaron Kamphuis

Abstract: Recommendation systems are a useful tool in modern-day technology because of their ability to statistically analyze data to predict products a customer would be interested in. They can be built in many different ways, including on the cloud computing platform Amazon Web Services (AWS). The goal of this project was to create a data pipeline for a recommendation system for one of Open Systems Technologies' clients, Herman Miller, using AWS. Herman Miller does not always sell directly to consumers and instead sometimes uses dealers as middlemen. These dealers use filtering of Herman Miller’s catalog along with their prior knowledge to recommend products to consumers. A recommendation system allowed more accurate product suggestions to be shown to customers based on their specifications. Using multiple services within AWS, including RDS, Glue, S3, SageMaker, Lambda, and CloudFormation, a continuous flow of data was created for this recommendation process. The data pipeline both provided dealers with the products they should recommend to Herman Miller customers based on their wants and needs and could be easily replicated in any AWS space for future use.


Amway Data Science Internship: An Analysis of the Sky Air Filtration System and Indoor Air Quality

DSA Internship

Presented By: Samantha Milano

Advisor: Josh Schwannecke

Abstract: Indoor air filtration systems can be a vital tool in homes across the globe, depending on the quality of the air within the home. Over the past years, Amway has developed an indoor air filtration system, Atmosphere, that has proven effective at removing and reducing critical particles that threaten the health of individuals within the home. While indoor air filtration systems can aid almost any home, there may be a stronger demand for air filtration systems in specific regions. In order to identify these regions, the relationship between indoor and outdoor air quality was explored. Telemetry data from the new Sky unit within the Atmosphere line was collected over the span of a year from various states across the country. With the fast-approaching release of Sky in other regions, this data was used to gather information on indoor air quality status under various conditions. Outdoor air quality was represented using a standard measurement of air quality called AQI (Air Quality Index). Outside resources were used to identify supplementary data that may explain the relationship between indoor and outdoor air quality. Using the sum of this data, a regression model was generated, with the factors in the final model representing those most influential on the relationship between indoor and outdoor air quality. Where the relationship is strongest, we can observe historic AQI patterns to predict optimal locations and time intervals in which individuals would most need an indoor air filtration unit. Cross-validation simulations were used to assess the accuracy and reliability of the model.
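
The cross-validation step mentioned above can be sketched in a few lines. The code below is a simplified illustration, not the internship's model: it assumes a single predictor (outdoor AQI) and a single response (an indoor air quality reading), uses interleaved folds, and the names `fit` and `k_fold_mae` are hypothetical. The actual regression included additional factors that the abstract does not list.

```python
def fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mae(xs, ys, k=5):
    """Mean absolute error over k folds; fold f holds out samples f, f+k, ..."""
    n = len(xs)
    errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [(xs[i], ys[i]) for i in range(n) if i not in held_out]
        slope, intercept = fit([x for x, _ in train], [y for _, y in train])
        # Score the model only on samples it did not see during fitting.
        errors.extend(
            abs(intercept + slope * xs[i] - ys[i]) for i in sorted(held_out)
        )
    return sum(errors) / len(errors)
```

Because every sample is scored by a model that never saw it, the resulting error is a fairer estimate of how the regression would behave on a new region than the error on the training data itself.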