Previous Thesis/Project Final Exam Schedule

Previous Quarters

Previous Final Examination Defense Schedule

An archive of selected previous School of STEM master's degree theses and projects.

Select a master's program to navigate to candidates:

Master of Science in Computer Science & Software Engineering

AUTUMN 2022

Tuesday, November 15

SAMRIDHI AGRAWAL

Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Optimized Provenance Reconstruction for Machine-Generated Data

Data is created, deleted, copied, and modified easily over the Internet, which makes it difficult to establish its authenticity and credibility. It is therefore important to reconstruct the provenance of data that has lost its provenance information. Techniques exist that help recover the metadata from which provenance can be reconstructed. However, many systems fail to capture provenance because they lack capture mechanisms such as source-file repositories or file storage systems. The provenance-reconstruction approach proposed by the Provenance and Traceability Research Group spans numerous projects on reconstructing provenance.

The current research (OneSource) applies various reconstruction techniques to machine-generated datasets, using attributes such as file size, the semantic meaning of file content, and per-file word counts. OneSource improves provenance reconstruction for Git commit histories as machine-generated datasets. The OneSource algorithm uses a multi-funneling approach that combines data cleaning in Python, topic modeling and cosine similarity for clustering, and a lineage algorithm with known endpoints to achieve higher accuracy in recovering valid provenance information. OneSource generates ground-truth data by extracting the commit history and file versions of a Git repository. To assess performance, the model is evaluated on datasets of varying data size and file count. OneSource reconstructs the provenance of clusters and the relationships of files within each cluster (cluster derivation). The evaluation results indicate that OneSource reconstructs cluster provenance with 90% precision and cluster-derivation provenance with 66% precision when cosine similarity is used as the clustering method, a 60% accuracy improvement for cluster derivation over the existing technique. Future studies may use parallelization for larger datasets, and optimizations to the lineage algorithm may further improve model performance.
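The cosine-similarity clustering step can be illustrated with a minimal sketch. The actual OneSource implementation is not reproduced here; the word-count vectors, greedy grouping strategy, and 0.8 threshold below are illustrative assumptions only:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_by_similarity(vectors, threshold=0.8):
    # Greedy clustering: a file joins the first cluster whose
    # representative it is sufficiently similar to.
    clusters = []  # list of (representative_vector, member_indices)
    for i, v in enumerate(vectors):
        for rep, members in clusters:
            if cosine_similarity(v, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((v, [i]))
    return [members for _, members in clusters]

# Toy word-count vectors for four file versions.
vecs = [np.array([3.0, 1.0, 0.0]),
        np.array([3.0, 1.2, 0.1]),   # near-duplicate of the first
        np.array([0.0, 0.5, 4.0]),
        np.array([0.1, 0.4, 4.2])]   # near-duplicate of the third
print(cluster_by_similarity(vecs))   # two clusters of two files each
```

A real pipeline would derive the vectors from topic modeling rather than raw counts, but the grouping logic is the same.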

Thursday, December 1

DI WANG

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Real-time Cloth Simulation

Computer-animated cloth is commonplace in video games and filmmaking. Believable cloth animation greatly improves the sense of immersion for video gamers. Simulated cloth can seamlessly blend into live-action footage, allowing filmmakers to generate desired visuals by adjusting parameters. Real-time cloth simulation allows a user to interact with the materials, making virtual cloth try-on experiences possible. All these use cases prefer fast solutions, ideally achieving believable cloth animation in real time. It is, however, challenging to simulate cloth both accurately and efficiently, due to its infinite deformation scenarios and complex inner mechanics. This project studied two solutions to real-time cloth simulation: 1) an iterative approach using a linear solver, which fully exploits the GPU’s parallel processing architecture and efficiently solves cloth with thousands of vertices in real time; and 2) a nonlinear solver based on the Projective Dynamics global-local optimization technique. We implemented an interactive application to demonstrate the two solutions and assessed their quality based on generality, correctness, and efficiency of the results. Our results show that both solvers are capable of generating believable real-time cloth animations in a wide range of testing scenarios: they react interactively to changes in cloth attributes and internal and external forces, can be properly illuminated and texture mapped, and are capable of interacting with other objects in the scene, e.g., with proper collision. We investigated the known conditions under which the solvers generate incorrect results: the instability of the linear solver with overly stiff spring constants, and the stiff self-bending of the nonlinear solver, which prevents realistic wrinkles.
Through experimentation and theoretical reasoning, we determined that the linear solver's instability is inevitable, while the nonlinear solver's stiff bending can be improved by a more sophisticated energy definition. To evaluate both solvers' efficiency, we recorded actual runtimes and derived each performance function with respect to cloth resolution. Our results verify the expected algorithmic complexity: within the GPU-supported range, the linear solver's runtime remains constant as resolution increases, while the nonlinear solver's runtime grows at an O(N³) rate.
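The mass-spring idea underlying such solvers can be sketched minimally. The toy below uses a serial, damped explicit-Euler step on a three-particle hanging chain; the project's GPU linear solver, spring constants, and cloth topology are not reproduced here, so every number is an illustrative assumption:

```python
import numpy as np

def step_cloth(pos, vel, springs, rest, k=50.0, mass=0.1, dt=0.005, g=-9.8):
    """One damped explicit-Euler step of a mass-spring system (particle 0 pinned)."""
    force = np.zeros_like(pos)
    force[:, 1] += mass * g                      # gravity on every particle
    for (i, j), r in zip(springs, rest):
        d = pos[j] - pos[i]
        length = np.linalg.norm(d)
        f = k * (length - r) * d / length        # Hooke's law along the spring
        force[i] += f
        force[j] -= f
    vel = (vel + dt * force / mass) * 0.95       # simple velocity damping for stability
    vel[0] = 0.0                                 # pin the top particle
    return pos + dt * vel, vel

# A vertical chain of three particles connected by two springs.
pos = np.array([[0.0, 0.0], [0.0, -1.0], [0.0, -2.0]])
vel = np.zeros_like(pos)
springs, rest = [(0, 1), (1, 2)], [1.0, 1.0]
for _ in range(200):
    pos, vel = step_cloth(pos, vel, springs, rest)
```

After the loop, the chain has settled slightly stretched under gravity. The abstract's observation about overly stiff springs can be reproduced here too: raising `k` far enough makes this explicit step diverge.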


KEVIN WANG

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Performance and Programmability Comparison of Parallel Agent-Based Modeling (ABM) Libraries

Agent-based modeling (ABM) allows researchers in the social, behavioral, and economic (SBE) sciences to use software to model complex problems and environments that involve thousands to millions of interacting agents. These models require significant computing power and memory to handle the high numbers of agents. Various research groups have implemented parallelized ABM libraries which allow models to utilize multiple computing nodes to improve performance with higher problem sizes. However, it is not clear which of these libraries provides the best-performing models and which is the easiest to develop a model with. The goal of this project is to compare the performance and programmability of three current parallel ABM libraries, MASS C++, RepastHPC, and FLAME. The Distributed Systems Lab at the University of Washington Bothell developed Multi-Agent Spatial Simulation (MASS) C++ as a C++-based ABM library. Different research groups developed RepastHPC and FLAME before MASS C++, and SBE researchers have successfully used these libraries to create agent-based models. To measure performance, we designed a set of seven benchmark programs covering various problems in the SBE sciences, and implemented each of them three times using MASS C++, RepastHPC, and FLAME. We compared the average execution times of the three implementations for each benchmark to determine which library performed the best. We found that certain benchmarks would perform better with MASS C++ compared to RepastHPC, while for other benchmarks the opposite was true. However, we found that across all benchmarks FLAME had the worst performance since it could not handle the same parameters given to the MASS C++ and RepastHPC implementations. To measure programmability, we performed a static code analysis and manual code review of each benchmark implementation to assess the three libraries quantitatively and qualitatively. 
We found that in terms of quantitative metrics, none of the three libraries was conclusively more programmable than the others. However, MASS C++ and RepastHPC may have more desirable qualities for developing agent-based models compared to FLAME.

Friday, December 2

JASON KOZODOY

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Automobile Retrieval Price Predictive System Using LightGBM with Permutation Feature Importance

The vehicle market includes hybrid, gasoline, and electric models, and the varying features across these vehicle types create a unique problem in vehicle selection and price prediction. We create an automobile retrieval price predictive system that lets users retrieve differently powered cars with similar vehicle features. Our system focuses on selecting vehicles of multiple power types for users and predicting prices for the selected vehicles; it also makes recommendations based on past vehicle selections. Our capstone project compares four regression models: LightGBM, FastForest, Ordinary Least Squares, and Online Gradient Descent, covering both ensemble and linear machine learning models on automobile datasets for price prediction. After this comparison, we selected the LightGBM regression model for our personalized support retrieval prediction system; it achieved a price prediction accuracy of 0.97 in our cross-validated regression evaluation. Furthermore, we record permutation feature importance scores within the system to show how feature importance differs after predictive learning. The system displays these rankings alongside car results, giving the user insight into how different features influence price predictions through low, medium, and high importance rankings for vehicle features.
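Permutation feature importance itself is model-agnostic and short enough to sketch. As a hedge, the sketch below substitutes an ordinary-least-squares fit for the LightGBM model and uses synthetic data, so the feature names and numbers are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "vehicle" data: price depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
X = rng.normal(size=(500, 3))
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

def r2(y_true, y_pred):
    # Coefficient of determination, the usual regression accuracy score.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Fit an ordinary least-squares model (stand-in for LightGBM here).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
baseline = r2(y, X @ coef)

def permutation_importance(X, y, coef, n_repeats=5):
    # Importance of a feature = drop in R^2 when that column is shuffled.
    scores = []
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, col] = rng.permutation(Xp[:, col])
            drops.append(baseline - r2(y, Xp @ coef))
        scores.append(np.mean(drops))
    return scores

importance = permutation_importance(X, y, coef)
```

Ranking `importance` and binning it into low/medium/high bands gives exactly the kind of per-feature display the abstract describes.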


LUYAO WANG

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Thesis: Real-Time Hatch Rendering

Hatching has been a common and popular artistic drawing style for centuries. In computer graphics rendering, hatching has been investigated as one of many Non-Photorealistic Rendering solutions. However, existing hatch rendering solutions are typically based on simplistic illumination models, and real-time 3D hatch-rendered applications are rarely seen in interactive systems such as games and animations. This project studies the existing hatch rendering solutions, identifies the most appropriate one, develops a real-time hatch rendering system, and improves upon existing results in three areas: supporting general illumination and hatch tone computation related to observed artistic styles, unifying spatial coherence support for Tonal Art Maps and mipmaps, and demonstrating support for animation.
 
The existing hatch rendering solutions can be categorized into texture-based and primitive-based methods. These solutions can be derived in object or screen space. Based on our background research, we chose to examine the texture-based object-space method presented by Praun et al. The approach inherits the advantage of object-space temporal coherence. The object-space spatial incoherence is addressed by the introduction of the Tonal Art Map (TAM). The texture-based solution ensures that the rendering results resemble actual artists' drawings.
 
The project investigated the solution proposed by Praun et al. based on two major components: TAM generation as an off-line pre-computation and real-time rendering via a Multi-Texture Blending shader.

The TAM construction involves building a two-dimensional structure, vertically to address spatial coherence as projected object size changes and horizontally to capture hatch tone changes. This unique structure enables the support for smooth transitions during zoom and illumination changes. We have generalized the levels in the vertical dimension of a TAM to integrate with results from traditional mipmaps to allow customization based on spatial coherence requirements. Our TAM implementation also supports the changing of hatch styles such as 90-degree or 45-degree cross hatching.

The Multi-Texture Blending shader reproduced the results from Praun et al. in real time. Our rendered results present objects with seamless hatch strokes that appear natural and resemble hand-drawn hatch artwork. Our implementation integrated and supported interactive manipulation of effects from general illumination models, including specularity, light source types, variable hatch and object colors, and rendering of surface textures as cross hatch. Additionally, we investigated trade-offs between per-vertex and per-fragment tone computation and discovered that smoothness in hatching is better captured by per-vertex computation, with its lower sampling rate and interpolation. Finally, the novel integration of TAMs and traditional mipmaps allows customizable spatial coherence support, enabling smooth hatch strokes and texture transitions in animations during object size and illumination changes.
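The tone-to-blend-weight mapping along a TAM's horizontal (tone) dimension can be sketched as a cross-fade between adjacent tone levels. This is a simplified illustration; the project's actual shader weighting scheme may differ, and the six-level TAM size is an assumption:

```python
def tam_blend_weights(tone, num_levels=6):
    """Blend weights over TAM tone levels for a tone in [0, 1].

    Adjacent tone images are cross-faded so hatch density changes
    smoothly as illumination changes (level 0 = lightest tone).
    """
    tone = min(max(tone, 0.0), 1.0)
    scaled = tone * (num_levels - 1)
    lower = int(scaled)
    upper = min(lower + 1, num_levels - 1)
    frac = scaled - lower
    weights = [0.0] * num_levels
    weights[lower] += 1.0 - frac
    weights[upper] += frac
    return weights
```

A tone of 0.5 on a six-level TAM, for example, blends levels 2 and 3 equally; in the shader the same weights would multiply the corresponding TAM texture samples.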

Monday, December 5

JONATHAN LEE

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Psychosis iREACH: Reach for Psychosis Treatment using Artificial Intelligence

Psychosis iREACH aims to optimize the delivery of evidence-based cognitive behavioral therapy to family caregivers who have a loved one with psychosis. It is an accessible digital platform that can utilize the user's intent and entities to determine the appropriate response. The platform is implemented based on an artificial intelligence and natural language understanding (NLU) framework, RASA. We developed the web application of the platform, and the chatbot has been integrated into the platform to collect data and evaluate performance. The link to the website is https://psychosisireach.uw.edu/. 

Thursday, December 8

MEGHNA REDDY

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Audio Classifier for Melodic Transcription in Language Documentation and Application (MeTILDA)

There are about 7000 languages around the world, and 42% of them are considered endangered due to declining numbers of speakers. Blackfoot is one such language, primarily spoken in Northwest Montana and Southern Alberta. MeTILDA (Melodic Transcription in Language Documentation and Application) is a collaborative platform created for researchers, teachers, and students to interact, teach, and learn endangered languages; it is currently being developed around the Blackfoot language. Deep learning has progressed rapidly in the field of audio classification and has shown potential as a tool for linguistic researchers documenting and analyzing endangered languages. This project creates a web application that supports deep learning research for MeTILDA and assists researchers of Blackfoot in their continuing efforts. The application focuses on the automatic classification of different sounds in Blackfoot, specifically vowels and consonants, and provides three main functionalities. The dataset preparation section allows the user to create datasets of vowels and consonants easily, reducing manual effort. The feature extraction section allows the user to extract audio features of their choice, such as Mel-frequency cepstral coefficients (MFCCs), spectrograms, and spectral features, for further processing and model re-training. The audio classifier section allows the user to automatically obtain instances of vowels and consonants in user-provided Blackfoot audio files; it uses an optimized ANN with spectral features and MFCCs as inputs and achieves an accuracy of 89%. This application reduces manual, time-intensive work for researchers of Blackfoot and can be extended to classify other sounds in the future.
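One of the spectral features mentioned above, the spectral centroid, is simple enough to sketch directly. This is a minimal NumPy version evaluated on a synthetic tone; the application's actual feature-extraction pipeline is not reproduced here:

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    # Magnitude-weighted mean frequency of the signal's spectrum --
    # one of the simple spectral features used alongside MFCCs.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))

# A pure 1 kHz tone has its spectral centroid at about 1 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
print(round(spectral_centroid(tone, sr)))
```

Vowels, with their strong low-frequency formant energy, tend to have lower centroids than fricative consonants, which is why such features help a vowel/consonant classifier.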

Friday, December 9

CAROLINE TSUI

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Agent-based Graph Applications in MASS Java and Comparison with Spark

Graph theory is constantly evolving as it is applied to mathematics, science, and technology, with active applications in communication networks, computer science (algorithms and computation), and operations research (scheduling). Research on the realization and optimization of graph theory is of great significance to various fields. However, with today's ever-growing databases, datasets (which can be represented as graphs for graph-theoretic analysis) in academia and industry have reached the level of petabytes (1,024 terabytes) or even exabytes (1,024 petabytes). Analyzing and processing massive graphs has become a principal task in different fields, and it is very challenging to process such rapidly growing, huge graphs in a reasonable amount of time with limited computing and memory resources. To meet the need for improved performance, parallel frameworks that support graph computing have emerged. However, it is unclear how these parallelization libraries differ in performance and programmability for graph theory research or graph application development. The goal of this project is to compare the performance and programmability of two parallel libraries for graph programming: MASS Java, developed by the DSLab at the University of Washington Bothell, and Spark (including Spark GraphX), developed by the AMPLab at the University of California, Berkeley. To balance performance and programmability, we used MASS Java and Spark to design and develop Graph Bridge, Minimum Spanning Tree, and Strongly Connected Components in each library, for a total of six graph applications.
After three rounds of running the applications and comparing their performance, the results show that for Graph Bridge, Spark performs slightly better than MASS Java, while for Minimum Spanning Tree and Strongly Connected Components, MASS Java performs slightly better. Because MASS Java provides agents, it can more flexibly handle vertex-based regional operations and pass data between agents; Spark is not an agent-based library. For Graph Bridge, however, which requires depth-first traversal to obtain results, the agent advantage of MASS does not come into play. To measure programmability, we performed quantitative and qualitative evaluations. The results show that the two libraries' programmability is similar, but from the user's point of view, MASS Java is more intuitive and better suited to developing graph applications.


CHRIS LEE

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Extending a Cloud-Based System for Endangered Language Analysis and Documentation

With 40% of the world’s 7000 languages considered endangered, there is a significant need to document and analyze these languages to preserve them and their associated culture and heritage. Blackfoot, spoken by approximately 2800 speakers in Alberta, Canada and Montana in the United States, is one such endangered language. Classified as a pitch accent language, the meaning of a Blackfoot word depends not exclusively on its spelling but on the pitch patterns of the spoken word, which makes it challenging to teach and learn. To overcome this challenge and aid in the revitalization of the Blackfoot language, we collaborated with researchers at the University of Montana to develop a cloud-based system known as MeTILDA (Melodic Transcription in Language Documentation and Application). The goal of this project is to modernize the technologies originally used in the MeTILDA system, extend its analytic capabilities to incorporate the study of rhythm, and improve its data reuse and collaboration capability by persisting the data used in creating the visual aids called Pitch Art. The proposed features will benefit linguistic researchers in furthering their understanding of the Blackfoot language, facilitate teachers in developing curricula for language acquisition, and help students take advantage of a teacher's guided learning plan. With this system, we aim to provide an extensible platform for future development supporting the documentation and preservation of other endangered languages.

SUMMER 2022

Monday, August 1

JUNJIE LIU 

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: COVID-19 Fake News Detector

COVID-19, caused by a coronavirus called SARS-CoV-2, has triggered a pandemic impacting people’s everyday lives for more than two years. With the fast spread of online communication and social media platforms, the amount of fake news related to COVID-19 is growing rapidly and propagating misleading information to the public. To tackle this challenge and stop the spread of fake news regarding COVID-19, this project proposes to build an online software detector specifically for COVID-19 news that classifies whether the news is trustworthy. Specifically, the intellectual contributions of this project are summarized below:

  1. This project specifically focuses on fake news detection for COVID-19-related news. Because it is difficult to train a generic model for all domains, the general practice is to fine-tune a base model to the specific domain context.
  2. A data collection mechanism obtains fresh COVID-19 fake news data and keeps the model up to date.
  3. Performance comparisons between different models: traditional machine learning models, ensemble machine learning models, and state-of-the-art transfer models.
  4. From an engineering perspective, the project will be the first online fake news detection website to focus on COVID-19-related fake news.

ANDREW NELSON

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.; Online
Project: Real-time Caustic Illumination 

Caustic illumination is the natural phenomenon that occurs when light rays bend as they pass through transparent objects and focus onto receiver objects. One might notice this effect on the ocean floor as light rays pass through the water and focus on the floor. Rendering this effect in a simulated environment would provide an extra touch of realism in applications that are meant to fully immerse a user in the experience. Traditionally, caustic illumination is simulated with offline ray tracing solutions that simulate the physical phenomenon of transporting photon particles through refraction and depositing the results on the receiving object. While this approach can yield accurate results, it is computationally intensive, and these ray tracing solutions can only be rendered in batches. To support caustics in real time, the calculations must simulate the natural phenomenon of photons traveling through transparent objects in every rendering frame without slowing down the application. This project focuses on rendering caustics in real time using a multi-pass rendering solution developed by Shah et al. Their approach constructs a caustic map in every frame which is used by subsequent rendering frames to create the final effect. The goal of this project was to develop an application that renders caustics and supports user interaction in real time. Our implementation uses the Unity game engine to successfully create the desired effect while maintaining a minimum frame rate of thirty frames per second.

Thursday, August 4

KALUAD ABDULBASET SANYOUR

Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: The Role of Machine Learning Algorithms in Editing Genes of Rare Diseases

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is an adaptive immunity mechanism in prokaryotes. Scientists have discovered that it is a programmable system that can be used to edit the genes of various species, which allows us to edit genes causing some rare diseases. CRISPR is associated with the Cas9 protein, which causes double-stranded breaks in DNA. Cas9 binds to a gRNA that guides it to a specific site to be edited. Although gRNA is versatile and easy to design, it lacks accuracy in determining editable sites, which can misguide Cas9 to the wrong location and cause changes in different genes. Hence, the CRISPR process needs to find the ideal gRNA that guides Cas9 on-target and avoids off-target effects. Various machine learning (ML) algorithms can play an important role in evaluating gRNAs for the CRISPR mechanism, and recently many computational tools have been developed to predict the cleavage efficiency of designed gRNAs. This project aims to provide an overview and comparative analysis of various machine and deep learning (MDL)-based methods that are effective in predicting CRISPR gRNA on-target activity. Comparison results show that hybrid approaches combining deep learning with other ML algorithms produce excellent results.

Monday, August 8

SYED ABDULLAH

Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Remote Onboarding for Software Engineers: From “Forming” to “Performing”

Onboarding is the process by which a new employee joins, learns about, integrates into, and becomes a contributing member of a team. Successful onboarding is essential for moving a team from the Forming to the Performing stage. It helps increase the new hire’s job satisfaction, improves the team’s performance, and reduces turnover (which brings the team back to the Forming stage). With remote work the new norm in software engineering, remote onboarding brings a unique set of challenges.

In this project, I aim to identify the main challenges faced during remote onboarding for software engineers, specifically for role-specific onboarding that happens in the team domain, and to provide recommendations on improving this onboarding process. To achieve these aims, I conducted a qualitative interview study and activity exercise with software engineers who have gone through remote onboarding. Nine interviews were conducted with participants ranging from junior and senior software engineers to software engineering managers. I analyzed these interviews to gain insight into factors affecting onboarding. From the interviews, I identified a hierarchy of needs, in which I classified the needs of the new hire into basic needs and needs required for excellence. Needs such as access to tools, clarity of tasks, and knowledge were categorized as basic needs for doing the work, whereas mentorship, relationship building, and collaboration transform the onboarding into an excellent experience. I then linked these needs to five main themes that emerged during the interviews for effective onboarding: (i) having an effective onboarding buddy; (ii) the ability to create relationships with team members and other stakeholders; (iii) being provided with an up-to-date, organized documentation and onboarding plan; (iv) the manager's ability to listen and adapt to remote needs; and (v) a team culture that enables team members to communicate effectively and get unblocked quickly. Based on the interview analysis together with insights from the literature, I developed checklists of recommended best practices for effective onboarding. A checklist was developed for each of the main onboarding stakeholders, i.e., the manager, the onboarding buddy, and the new hire, along with a template of an onboarding plan. Using these checklists will help improve the effectiveness and consistency of remote onboarding for software engineering new hires.

Tuesday, August 9

ASHWINI ARUN RUDRAWAR

Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Thesis: Evaluating the Impact of GPU API Evolution on Application Performance and Software Quality

Researchers and engineers have started to build parallel computing systems using sequential CPU plus parallel GPU programs. In recent years, an increasing number of GPU hardware devices have become available, along with software solutions that support them. Substantial work is required to identify the best combination of hardware and software for building heterogeneous solutions. One combination developers use is NVIDIA GPUs with CUDA APIs. With rapid architectural changes in GPU hardware, the behavior of the related APIs also changes, and backward-compatibility limitations cause considerable regression in applications built against prior API versions. This thesis evaluates the evolution of NVIDIA GPUs and CUDA APIs using Graphitti, a graph-based heterogeneous CUDA/C++ simulator. It identifies the advantages, limitations, and underlying behavior of a subset of APIs, exploring them in the context of performance, compatibility, ease of development, and code readability, and discusses how this process helped implement a software change compatible with the simulator. The thesis documents the implementation of two features, separate compilation and link-time optimization, in the simulator, and how they will help users write modular code in Graphitti. It also shows almost no performance overhead on one of the largest neural network simulations in Graphitti. The implementation offers flexibility and scope to enhance Graphitti's heterogeneous nature, which will help simulate much larger networks.
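For readers unfamiliar with the two features, the corresponding nvcc invocations look roughly like this (file names are illustrative, not Graphitti's actual sources; `-dc` requests relocatable device code for separate compilation, and `-dlto` enables device link-time optimization in CUDA 11.2+):

```shell
# Separate compilation: compile each translation unit to relocatable
# device code, then let nvcc device-link the objects at the end.
nvcc -dc -o neuron.o neuron.cu
nvcc -dc -o synapse.o synapse.cu
nvcc -o simulator neuron.o synapse.o

# Device link-time optimization: add -dlto at both compile and link
# steps to recover cross-file inlining lost to separate compilation.
nvcc -dc -dlto -o neuron.o neuron.cu
nvcc -dc -dlto -o synapse.o synapse.cu
nvcc -dlto -o simulator neuron.o synapse.o
```

The `-dlto` pair is what lets separately compiled device code approach whole-program performance, matching the thesis's finding of almost no overhead.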

Wednesday, August 10

ANDREW HITOSHI NAKAMURA

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Thesis: Macromolecular Modeling: Integrating DNA/RNA into DeepTracer's Prediction Pipeline

DeepTracer is a fully automatic deep learning-based method for fast de novo multi-chain protein complex structure determination from high-resolution cryo-electron microscopy (cryo-EM) density maps. The macromolecular pipeline extends DeepTracer’s functions by adding a segmentation step and pipeline steps to predict nucleic acids from the density. Segmentation uses a Convolutional Neural Network (CNN) to separate the densities of the two types of macromolecules, amino acids and nucleotides. Two U-Nets are trained to predict amino acid and nucleotide atoms in order to predict the structure from the density data. The nucleotide U-Net was trained on a sample of 163 cryo-EM maps containing nucleotide density, and it identifies phosphate, sugar carbon 4, and sugar carbon 1 atom positions. Compared to Phenix’s pipeline, amino acids show favorable RMSD metrics, and nucleotides show comparable phosphate and nucleotide correlation coefficient (CC) metrics. The trained nucleotide U-Net model primarily focuses on double-stranded DNA/RNA. Future work involves training the nucleotide U-Net on more density map data to detect single-stranded DNA/RNA and removing phosphate outliers in postprocessing to improve the nucleic acid prediction.


ALEX XIE

Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Improving the Quality of Inference during Edge Server Switch for Applications using Chained DNN Models

Recent advances in deep neural networks (DNN) have substantially benefited intelligent applications, for example, real-time video analytics. More complex DNN models typically require a more robust computing capacity. Unfortunately, the considerable computation resource requirements of DNNs make the inference on resource-constrained mobile devices challenging. Edge intelligence is a paradigm solving this issue by offloading DNN inference tasks from mobile devices to more powerful edge servers. Due to user mobility, however, one challenge for such mobile intelligent services is maintaining the quality of service during the handover between edge servers. To address this problem, we propose in this report a solution to help improve the quality of inference for real-time video analytics applications that use chained DNN models. The scheme comprises two sub-schemes: (1) a non-handover scheme that determines the optimal offloading decisions with the shortest end-to-end inference latency, and (2) a handover scheme that improves the inference quality by maximizing the usage of mobile devices for the most useful inference outcomes. We evaluated the proposed scheme using a DNN-based real-time traffic monitoring application via testbed and simulation experiments. The results show that our solution can improve the inference quality by 57% during handovers compared to a greedy algorithm-based solution.
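The non-handover sub-scheme's core idea, picking the lowest-latency cut point in a chained model, can be sketched as a brute-force search. The stage and upload timings below are invented for illustration; the report's actual formulation and parameters may differ:

```python
def best_split(local_ms, edge_ms, upload_ms):
    """Choose where to cut a chained DNN: stages [0, k) run on the
    mobile device, stages [k, n) on the edge server.

    local_ms / edge_ms: per-stage inference times on each side;
    upload_ms[k]: time to ship the data at cut point k (k = n means
    everything stays on the device, so upload_ms[n] is 0).
    """
    n = len(local_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        latency = sum(local_ms[:k]) + upload_ms[k] + sum(edge_ms[k:])
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency

# Three-stage chain: a fast edge server but a costly initial upload,
# so running the first stage locally shrinks the data before offload.
local = [30.0, 50.0, 40.0]        # ms per stage on the phone
edge = [5.0, 8.0, 6.0]            # ms per stage on the edge server
upload = [60.0, 12.0, 15.0, 0.0]  # ms to transmit each cut point's data
print(best_split(local, edge, upload))  # → (1, 56.0)
```

During a server handover, the same search re-run with degraded upload estimates would naturally shift more stages onto the device, which is the intuition behind the handover sub-scheme.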

Thursday, August 11

MICHAEL J. WAITE

Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Mobile-Ready Expression Analysis

The field of computerized facial expression analysis has grown rapidly in recent years, with multiple commercial solutions and a plethora of research being produced. However, there has not been much focus on this technology's use in disability assistance. Studies have shown that an inability to read facial expressions can have a drastic negative impact on a person's life, presenting a clear need for tools to help those affected. Most work in this field prioritizes analytic performance over computational performance. This project aims to create an application that people with disabilities can use to read facial expressions in situations where they cannot, with a focus on computational performance to allow for real-time analysis. Using a simplified methodology inspired by classic object detection methods such as SIFT and SURF, we found that our emotional analysis can achieve a computational performance of 100 milliseconds per image while retaining an overall accuracy of 64% when evaluated on the CK+ database. We hope that in the future our system can be further developed to achieve greater accuracy with minimal loss in computational performance using machine learning.

SPRING 2022

Wednesday, May 4

DAT TIEN LE

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Thesis: Emulated Autoencoder: A Time-Efficient Image Denoiser for Defense of Convolutional Neural Networks against Evasion Attacks

As Convolutional Neural Networks (CNN) have become essential to modern applications such as image classification on social networks and self-driving vehicles, evasion attacks targeting CNNs can lead to damage for users. Therefore, a rising amount of research has focused on defending against evasion attacks. Image denoisers have been used to mitigate the impact of evasion attacks; however, there is not a sufficiently broad view of their use as adversarial defenses in image classification due to a lack of trade-off analysis. Thus, this thesis explores the trade-offs of a group of image denoisers, including training time, image reconstruction time, and loss of benign F1-scores of CNN classifiers. Additionally, the Emulated Autoencoder (EAE), the method this thesis proposes to optimize these trade-offs for high-volume classification tasks, is evaluated alongside state-of-the-art image denoisers in both the gray-box and white-box threat models. EAE outperforms most image denoisers in both threat models while drastically reducing training and image reconstruction time compared to the state-of-the-art denoisers. As a result, EAE is more appropriate for securing high-volume image classification applications.

Wednesday, May 18

NIRALI GUNDECHA

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Lambda and Reduction method implementation for MASS Library

MASS is a parallelizing library that provides multi-agent and spatial simulation over a cluster of computing nodes. The goal of this capstone is to reduce the communication overhead for data and make the user experience effortless, thereby improving the efficiency of MASS.

This paper introduces two new features, lambda and reduction methods, and their implementation to achieve these goals. To date, no agent-based library has provided these features, making this a distinctive contribution. This paper validates the lambda and reduce methods using the MASS library.

The lambda method implementation gives users the flexibility to use the MASS library frictionlessly. Using lambda methods, users can describe new behavior on the fly and obtain results instantaneously. On top of the lambda feature, the reduce method performs a reduction over any type of user or Agent data; the operation can be anything, such as max, min, or sum.

The data collection method is itself described as a lambda method. Using the reduce method, users can perform a reduction in a single line of code, which improves reliability and keeps code clean. These features remove the hassle of writing blocks of code and getting involved in agents' behavior over a cluster of nodes, which is both distinctive and innovative for an agent-based library and its users.
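As a plain-Python sketch of the idea (not the MASS API itself, whose signatures may differ), a one-line lambda reduction over agents might look like:

```python
from functools import reduce

class Agent:
    def __init__(self, energy):
        self.energy = energy

# A tiny stand-in for agents distributed over a cluster; in MASS the
# reduction would run in parallel across nodes, but the user-facing idea
# is the same: a one-line lambda instead of hand-written traversal code.
agents = [Agent(e) for e in (3, 7, 5)]

total_energy = reduce(lambda acc, a: acc + a.energy, agents, 0)
max_energy = reduce(lambda acc, a: max(acc, a.energy), agents, float("-inf"))
print(total_energy, max_energy)  # → 15 7
```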


PALLAVI SHARMA

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Text Synthesis

With the explosion of data in the digital domain, manually synthesizing long texts to extract important information is a laborious and time-consuming task. Mobile-based text synthesis systems that can take text input and extract important information can be very handy and would reduce the overall time and effort required for manual text synthesis. In this work, a novel system is developed that lets users extract summaries and keywords from long texts in real time using a cross-platform mobile application. The mobile application uses a hybrid approach based on feature extraction and unsupervised learning to generate quality summaries. In this paper, 10 sentence features are used for feature extraction. A hybrid technique combining machine learning with semantic methods is used to extract keywords/key-phrases from the source text. The application also allows users to manage, share, and listen to the information extracted from the input text. Additional features, such as drafting error-free notes, improve the user experience. To test the reliability of this system, experimental evaluation was carried out on the DUC 2002 dataset using ROUGE metrics. Results demonstrate a 51% F-score, which is higher than state-of-the-art methods for extractive summarization on the same dataset. The hybrid approach for keyword/key-phrase extraction was tested for the validity of the resulting keywords. The application could produce proper keywords in the form of phrases and words with an accuracy of 70%.
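One of the sentence features, word frequency, can be illustrated with a minimal extractive scorer. This toy function is an assumption for illustration only, not the project's 10-feature hybrid:

```python
import re
from collections import Counter

def summarize(text, k=1):
    """Score sentences by normalized word frequency and keep the top k."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # average corpus frequency of the sentence's words
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    return sorted(sentences, key=score, reverse=True)[:k]

doc = "The cat sat on the mat. Dogs bark. The cat likes the mat."
print(summarize(doc, k=1))  # → ['The cat likes the mat.']
```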

Thursday, May 19

ZHIYANG ZHOU

Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Facial Recognition on Android Devices with Convolutional Neural Networks and Federated Learning

Machine Learning (ML) and Artificial Intelligence (AI) are widely applied in many modern services and products we use. Facial Recognition (FR) is a powerful ML application that has been used extensively in various fields. Traditionally, however, the models are trained on photos crawled from the World Wide Web (WWW), and they are often biased towards celebrities and the Caucasian population. Centralized Learning (CL), one of the most popular training techniques, requires all data to be on the central server to train ML models. However, it comes with additional privacy concerns, as the server takes ownership of end-user data. In this project, we first use Convolutional Neural Networks (CNN) to develop an FR model that can classify 7 demographic groups using the FairFace image dataset, which has a more balanced and diverse distribution of ordinary face images across racial groups. To further extend training accessibility and protect sensitive personal data, we propose a novel Federated Learning (FL) system using Flower as the backend and Android phones as edge devices. The pre-trained models are first converted to TensorFlow Lite models, which are then deployed to each Android phone to continue learning on-device from additional subsets of FairFace. Training takes place in real time, and only the weights are communicated to the server for model aggregation, thus keeping user data off the server. In our experiments, we explore various centralized model architectures to achieve an initial accuracy of 52.9% with a model lightweight enough to continue improving to 68.6% in the Federated Learning environment. Application requirements on Android are also measured to validate the feasibility of our approach in terms of CPU, memory, and energy usage. As future work, we hope the system can be scaled to enable training across thousands of devices and extended with a filtering algorithm to counter adversarial attacks.
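The server-side aggregation step can be sketched as FedAvg-style weighted averaging of client weights. This pure-Python fragment is illustrative and omits Flower's actual API:

```python
def fed_avg(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg-style aggregation).

    client_weights: per-client flat lists of floats (real systems aggregate
    layer by layer). client_sizes: local training examples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Two hypothetical phones: one trained on 100 images, one on 300.
merged = fed_avg([[1.0, 2.0], [5.0, 6.0]], [100, 300])
print(merged)  # → [4.0, 5.0]
```

The weighting keeps a phone with few local examples from dominating the global model, which matters when on-device datasets are unevenly sized.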

Friday, May 20

VISHNU MOHAN

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Automated Agent Migration Over Structured Data

Agent-based data discovery and analysis views big-data computing as the result of agent interactions over the data. It performs better on a structured dataset by keeping the structure in memory and moving agents over the space. The key is how to automate agent migration in a way that simplifies scientists' data analysis. We implemented this navigational feature in the multi-agent spatial simulation (MASS) library. First, this paper presents eight automatic agent navigation functions, each of which we identified, designed, and implemented in MASS Java. Second, we present the performance improvements made to the existing agent lifecycle management functions that migrate, spawn, and terminate agents. Third, we measure the execution performance and programmability of the new navigational functions in comparison to the previous agent navigation. The performance evaluation shows that the overall latency of all four benchmark applications improved with the new functions. The programmability evaluation shows that the new implementations reduced user lines of code (LOC) and made the code more intuitive and semantically closer to the original algorithm. The project successfully carried out two goals: (1) design and implement automatic agent navigation functions and (2) make performance improvements to the current agent lifecycle management functions.


CARL ANDERS MOFJELD

Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Adaptive Acceleration of Inference Services at the Network Edge

Deep neural networks (DNN) have enabled dramatic advancements in applications such as video analytics, speech recognition, and autonomous navigation. More accurate DNN models typically have higher computational complexity. However, many mobile devices do not have sufficient resources to complete inference tasks using the more accurate DNN models under strict latency requirements. Edge intelligence is a strategy that attempts to solve this issue by offloading DNN inference tasks from end devices to more powerful edge servers. Some existing works focus on optimizing the inference task allocation and scheduling on edge servers to help reduce the overall inference latency. One key aspect of the problem is that the number of requests, the latency constraints they have, and network connection quality will change over time. These factors all impact the latency budget for inference computation. As a result, the DNN model that maximizes inference quality while meeting latency constraints can change as well. To address this opportunity, other works have focused on dynamically adapting the inference quality. Most such works, though, do not solve the problem of how to allocate and schedule tasks across multiple edge servers, as the former group does. In this work, we propose combining strategies from both areas of research to serve applications that use deep neural networks to perform inference on offloaded video frames. The goals of the system are to maximize the accuracy of inference results and the number of requests the edge cluster can serve while meeting latency requirements of the applications. To achieve the design goals, we propose heuristic algorithms to jointly adapt model quality and route inference requests, leveraging techniques that include model selection, dynamic batching, and frame resizing. We evaluated the proposed system with both simulated and testbed experiments. Our results suggest that by combining techniques from both areas of research, our system is able to meet these goals better than either approach alone.
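One ingredient of such a system, model selection under a latency budget, can be sketched as follows. The model zoo and latencies are made-up numbers, and the real scheduler also handles request routing and dynamic batching:

```python
def pick_model(models, latency_budget_ms, batch=1):
    """Greedy model selection: most accurate model that fits the budget.

    models: (name, accuracy, per_frame_ms) tuples. batch scales latency
    linearly here, a simplification of real dynamic batching.
    """
    feasible = [m for m in models if m[2] * batch <= latency_budget_ms]
    if not feasible:
        return None  # caller would fall back to frame resizing or rerouting
    return max(feasible, key=lambda m: m[1])

zoo = [("small", 0.61, 8.0), ("medium", 0.72, 21.0), ("large", 0.80, 55.0)]
print(pick_model(zoo, latency_budget_ms=50))   # → ('medium', 0.72, 21.0)
print(pick_model(zoo, latency_budget_ms=200))  # → ('large', 0.8, 55.0)
```

As the abstract notes, the feasible set shifts as request load and network quality change, so the chosen model changes over time as well.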

Monday, May 23

ISHPREET TALWAR

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
9:00 A.M.; Online
Project: Recycle Helper - A Cross-Platform mobile application to Aid Recycling

With the growth of the planet's population, the amount of waste generated has also increased. Such waste, if not handled correctly, can cause environmental issues. One solution to this problem is recycling: the process of collecting and processing materials that would otherwise be thrown away as trash and turning them into new products. It can benefit the community and the environment. Recycling can be considered an umbrella term for the 3R's: Reduce, Reuse, and Recycle. Items appear in the surrounding environment in many different states and conditions, which makes recycling complex; knowing the correct way to recycle each item can be overwhelming and time-consuming. To help address this problem, this paper proposes a cross-platform mobile application that promotes recycling. It helps users by providing recycling instructions for different product categories. The application allows the user to capture or choose an image of an item using the phone camera or gallery. It uses software engineering methodologies and machine learning to identify the item and provide the relevant recycling instructions. The application is able to detect and predict items with an accuracy of 81.06% using a Convolutional Neural Network (CNN) model. To motivate and engage users, the application also allows them to set a monthly recycling goal, track their progress, and view their recycling history. The application is user-friendly and will help promote correct recycling in a less time-consuming manner.

Wednesday, May 25

YAN HONG

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Graph Streaming in MASS Java

This project facilitates graph streaming in agent-based big data computing, where agents discover the shape or attributes of a huge graph. Analyzing and processing massive graphs has become an important task in many domains because many real-world problems, such as biological networks and neural networks, can be represented as graphs. These graphs can have millions of vertices and edges, and it is quite challenging to process such a huge graph with limited resources in a reasonable timeframe. The MASS (Multi-Agent Spatial Simulation) library already supports a graph data structure (GraphPlaces) that is distributed over a cluster of computing nodes. However, when processing a big graph, we may still encounter two problems: construction overhead that delays the actual computation, and limited resources that slow down graph processing. To solve these two problems, we implemented graph streaming in MASS Java, which repeatedly reads a portion of a graph and processes it while reading the next portion. It supports the HIPPIE and MATSim file formats as input graph files. We also implemented two graph streaming benchmarks, Triangle Counting and Connected Components, to verify the correctness and evaluate the performance of graph streaming. These two programs were executed on 1 - 24 computing nodes, demonstrating significant CPU- and memory-scalable performance improvements. We also compared the performance with the non-streaming solution. Graph streaming avoids explosive growth of the agent population and loads only a small portion of the graph at a time, both of which make efficient use of limited memory.
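The Triangle Counting benchmark can be illustrated with a small sequential version; the MASS implementation streams the graph in portions and distributes the counting across agents, but the quantity computed is the same:

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triangles = 0
    for u, v in edges:
        # common neighbors of an edge's endpoints close a triangle
        triangles += len(adj[u] & adj[v])
    return triangles // 3  # each triangle is counted once per edge

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(count_triangles(edges))  # → 1
```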

Thursday, May 26

BRETT BEARDEN

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Redesigning the Virtual Academic Advisor System, Backend Optimizations, and Implementing a Python and Machine Learning Engine

Community college students aspire to continue their education at a 4-year college or university. The process of navigating college can be complex, let alone figuring out transfer requirements for individual schools. Assisting students in this process requires specialized knowledge of specific departments and majors. Colleges with smaller budgets do not have funds for additional academic advising staff, so the task gets passed to the teaching faculty. Student academic planning is a time-consuming process that can detract from an instructor's time needed to focus on their current courses and students. For years, a team of students at the University of Washington Bothell has been working on a Virtual Academic Advisor (VAA) system to automate the generation of student academic plans in support of Everett Community College (EvCC). The goal of the VAA system is to reduce the amount of time an instructor sits with an individual student during academic advisement. However, the VAA system is not yet complete, and a few roadblocks were preventing it from moving forward. The work proposed in this capstone focuses on redesigning the previous VAA system to remove fundamental flaws in how data related to scheduling academic plans is stored. A new system architecture will be designed to allow backend optimizations. Cross-language support will give the VAA system the ability to communicate with Python for conducting machine learning research. The proposed work brings the VAA system closer to completion and ready for deployment to support EvCC.


SANA SUSE

Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Classifying Urban Regions in Satellite Imagery Using the Bag of Words Methodology 

Satellite imagery has become more accessible over the years in both availability and quality, though the analysis of such images has not kept pace. To investigate the analysis process, this work explores the detection of urban area boundaries from satellite imagery. The ground truth boundaries were taken from the U.S. Census Bureau's geospatial urban area dataset and used to train a classification model based on the Bag of Words methodology. During training and testing, 1000x1000-pixel patches were used for classification. The resulting classification accuracy was between 85-90%, and urban areas were classified with higher confidence than non-urban areas. Most of the sub-images classified with lower confidence lie in the transition areas between urban and non-urban regions. Because of this low confidence in transition areas, and because the patches are quite large, the patches are not helpful for delineating granular details of urban area boundaries.
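The Bag of Words classification step can be sketched as a histogram over quantized local features ("visual words") followed by a nearest-centroid decision. The vocabulary, centroids, and patch below are made up for illustration; the project's classifier is trained rather than hand-set:

```python
import math
from collections import Counter

def bow_histogram(visual_words, vocab_size):
    """Normalized histogram over quantized local features."""
    counts = Counter(visual_words)
    total = len(visual_words)
    return [counts[w] / total for w in range(vocab_size)]

def classify(hist, centroids):
    """Label a patch histogram by its nearest class centroid."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda label: dist(hist, centroids[label]))

centroids = {"urban": [0.7, 0.2, 0.1], "non-urban": [0.1, 0.3, 0.6]}
patch = bow_histogram([0, 0, 0, 1, 2], vocab_size=3)  # mostly word 0
print(classify(patch, centroids))  # → urban
```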


TIANHUI NIE

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Visualization of 2D Continuous Spaces and Trees in MASS Java

MASS is an Agent-Based Modeling (ABM) library. It supports parallelized simulation over a distributed computing cluster. The Place objects in these simulations can be thought of as the environment where agents interact with each other. Places can mimic different data structures to simulate various interaction environments, such as graphs, multi-dimensional arrays, trees, and continuous spaces.

However, continuous spaces and trees are usually complex for programmers to debug and verify, so this project focuses on visualizing these data structures. The data structures are available in the MASS library and can be instantiated through InMASS, which enables Java's JShell interface to execute code line by line in an interactive fashion. InMASS also provides additional functionality, including checkpointing and rollback, which helps programmers inspect their simulations. MASS allows places and agents to be transferred to Cytoscape for visualization. Cytoscape is an open-source network visualization tool initially developed to analyze biomolecular interaction networks. The expanded Cytoscape MASS plugins build a MASS control panel in the Cytoscape application, helping users visualize graphs, continuous spaces, and trees in Cytoscape.

This project successfully realized the visualization of MASS binary trees, quadtrees, and 2D continuous spaces with Cytoscape. It also enhanced the MASS-Cytoscape integration and optimized the MASS control panel. These data structure visualizations give other users an easier way to learn the MASS library and debug their code.

Friday, May 27

MARÉ SIELING

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Agent-Based Database with GIS

Geographic Information Systems (GIS) create, manage, analyse, and map data. These systems are used to find relationships and patterns between pieces of data separated by large geographic distances. GIS data can be extremely large, and analysing it can be laborious while consuming a substantial amount of resources. By distributing the data and processing it in parallel, the system consumes fewer resources and improves performance.

The Multi-Agent Spatial Simulation (MASS) library applies agent-based modelling to big data analysis over distributed computing nodes through parallelisation. GeoTools is a GIS system that is installed on, and processes data on, a single node. By creating a distributed GIS from GeoTools with the MASS library, results are produced faster and more effectively than with traditional GIS systems located on a single node.

This paper discusses the efficacy of coupling GIS and MASS through agents that render fragments of feature data as layers on places, returning the fragments to be combined into a completed image. It also discusses distributing and querying the data, returning results from queries written in a query language (CQL). Image quality is retained when panning and zooming, without major loss of performance, by re-rendering the visible sections of the map through agents and parallelisation. Results show that coupling GIS and MASS significantly improves the efficiency and scalability of a GIS system.


LIWEN FAN

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Realistic Fluid Rendering in Real-Time

Real-time realistic fluid rendering is important because fluid is ubiquitous in Computer Generated Imagery (CGI) applications such as video games and movies. However, realism in fluid rendering can be complex because fluid does not have a concrete physical form or shape. There are many existing solutions for modeling the movement and the appearance of fluid. The movement of fluid focuses on simulating motions such as waves, ripples, and dripping. The appearance, or rendering, of fluid aims to reproduce the physical illumination process, including effects such as reflection, refraction, and highlights. Since these solutions address different aspects of modeling fluid, it is important to clearly understand application requirements when choosing among them.

This project focuses on the appearance, or rendering, of fluid. We analyze existing solutions in detail and adopt the one most suitable for real-time realistic rendering. With a selected solution, we explore implementation options based on modern graphics hardware. More specifically, we focus on graphics hardware that can be programmed through popular interactive graphical applications, for reasons of interactive modeling support, a high-level shading language, and fast turnaround debugging cycles. The solution proposed by van der Laan et al. in their 2009 I3D research article is the choice for this project. Our analysis shows that their approach is the most suitable because of its real-time performance, high-quality rendered results, and, very importantly, the implementation details it provides.

The graphics system and hardware evaluation led to the Unity game engine as our implementation platform, due to its friendly interactive 3D functionality, high-level shading language support, and efficient development cycles. In particular, the decision is based on Unity's Scriptable Render Pipeline (SRP), where the details of the image generation process can be highly customized. The SRP offers flexibility with ease of customizing shaders and control over the number of passes in processing the scene geometry for each generated image. In our implementation, the SRP is configured to compute the values of all the parameters in the fluid model via separate rendering passes.

Our implementation is capable of rendering fluid realistically in real time, where users have control over the actual fluid appearance. The delivered system supports two types of simple fluid motion: waves and ripples. The rendered fluid successfully captures the intrinsic color of the fluid under Fresnel reflection, the reflection of environmental elements, and highlights from the light sources. In addition, to give users full control over the rendered results, a friendly interface is provided. To demonstrate the system, we have configured it to showcase fluid rendering under some common conditions, including a swimming pool, a muddy pond, a green algae creek, and colored fluid in a flowery environment.
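The Fresnel behavior described above is often computed with Schlick's approximation; the sketch below shows the key effect (reflectance rising toward grazing angles) and is not necessarily the exact formulation used in the project:

```python
def schlick_fresnel(cos_theta, f0=0.02):
    """Schlick's approximation of Fresnel reflectance.

    f0 is the reflectance at normal incidence (~0.02 for water). Grazing
    views (cos_theta near 0) reflect the environment, while top-down
    views (cos_theta near 1) reveal the fluid's own intrinsic color.
    """
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

print(schlick_fresnel(1.0))  # looking straight down → 0.02
print(schlick_fresnel(0.0))  # grazing angle → 1.0
```

In a shader this runs per pixel, blending the reflected environment and the refracted/intrinsic fluid color by the returned weight.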

Wednesday, June 1

YILIN CAI

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Model Extraction Attacks Effectiveness And Defenses

Machine learning is developing quickly in the data industry, and many technology companies with the resources to collect huge datasets and train models are starting to offer pre-trained models as paid services. The cost of training a good model for business use is high because huge training datasets may not be easily accessible, and training the model itself requires a lot of time and effort. The increased value of a pre-trained model motivates attackers to conduct model extraction attacks, which extract valuable information from the target model or construct a close clone of it for free use, only by making queries to the victim. The goal of this experiment is to explore the vulnerability exploited by proposed model extraction attacks and to evaluate their effectiveness by comparing attack results as the victim model and its target datasets grow more complex. We first construct datasets for the attacks by querying the victim model; some attacks propose specific strategies for selecting queries. We then execute each attack either from scratch or using an existing test framework, run it with different victim models and datasets, and compare the results. The results show that attacks which extract information from a model are effective on simpler models but not on more complex ones; the difficulty of making a cheaper clone increases with model and dataset complexity, and the attacker may need more knowledge beyond query responses from the victim. Potential defenses and their weaknesses are also discussed after the experiment.
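The attack setting can be illustrated with a toy victim whose decision boundary the attacker recovers purely from query responses; this one-dimensional threshold model is a deliberately simple stand-in for the neural-network victims in the experiment:

```python
import random

def victim(x):
    """Secret decision boundary the attacker wants to clone (threshold 0.37)."""
    return 1 if x >= 0.37 else 0

def extract(query_budget, rng):
    """Clone the victim by querying it and fitting the same model family.

    The attacker sees only (query, label) pairs and estimates the boundary
    as the midpoint between the highest 0-labeled and lowest 1-labeled query.
    """
    queries = [rng.random() for _ in range(query_budget)]
    labeled = [(x, victim(x)) for x in queries]
    lo = max((x for x, y in labeled if y == 0), default=0.0)
    hi = min((x for x, y in labeled if y == 1), default=1.0)
    threshold = (lo + hi) / 2
    return lambda x: 1 if x >= threshold else 0

clone = extract(query_budget=200, rng=random.Random(0))
agreement = sum(clone(i / 1000) == victim(i / 1000) for i in range(1000)) / 1000
print(agreement)  # close to 1.0 for a victim this simple
```

More queries shrink the uncertainty interval around the boundary, mirroring the abstract's finding that simple victims are easy to clone while complex ones demand far more from the attacker.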


PRATIK GOSWAMI

Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Virtual Reality based Orthoptics for Binocular Vision Disorders

The Centers for Disease Control and Prevention noted that approximately 6.8% of children under the age of 18 in the United States are diagnosed with vision problems significant enough to impact learning. Binocular disorders can lead to headaches, blurry vision, double vision, loss of coordination, fatigue, and the inability to track objects, severely impacting a child's ability to learn. Without intervention, vision problems can lead to suppression of vision in the affected eye. Vision therapy, or orthoptics, is meant to help individuals recover their eyesight. It aims to retrain the user to achieve binocular fusion using therapeutic exercises. Binocular fusion refers to the phenomenon of perceiving a single fused image when each eye is presented with a separate image. Virtual Reality (VR) shows a lot of potential as an orthoptics medium. VR headsets can isolate the user from the physical world, reduce real-world distractions, provide a dichoptic display where each eye can be presented with a different input, and provide a customized therapy experience for the user.

Although several VR applications exist with a focus on orthoptics, clinicians report that these applications fail to strike a balance between therapy and entertainment. These applications can be too entertaining for the user and thus distract them from the therapy goals.

As a part of the EYE Research Group, I have developed 2 applications which, when added to the previously developed applications, form a VR toolkit for providing vision therapy to individuals diagnosed with binocular disorders. Each application in the toolkit focuses on a level of binocular fusion; the 2 applications I developed focus on the third and fourth levels, sensory fusion and motor fusion. The project was developed using the Unity game engine along with the Oculus VR plugin. All decisions about controls and features were made after analyzing feedback from and interviews with the therapists at the EYE See Clinic. Key design decisions were also the outcome of demonstrations and trials of the prototypes at the ACTION Forum 2021, which was attended by therapists, students, and researchers in the field of orthoptics.

Although the applications have been successfully developed and approved by the therapists at the EYE See Clinic, a clinical study is required to test their usability and effectiveness as therapy tools. As of May 16th, 2022, all applications have been successfully developed, tested, and approved by Dr. Alan Pearson, the clinical advisor to the EYE Research Group. A case study was proposed, reviewed, and approved by the UW IRB and the UW Human Subjects Division (HSD) board. The results of the study will be beneficial for future research.


FRANZ ANTHONY VARELA

Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.; Online
Thesis: The Effects of Hybrid Neural Networks on Meta-Learning Objectives

Historically, models do not generalize well when they are trained solely on a dataset/task's objective, despite the plethora of data and computing available in the modern digital era. We propose that this is at least partially because the representations of the model are inflexible when learned in this setting; in this paper, we experiment with a hybrid neural network architecture that has an unsupervised model at its head (the Knowledge Representation module) and a supervised model at its tail (the Task Inference module) with the idea that we can supplement the learning of a set of related tasks with a reusable knowledge base. We analyze the two-part model in the contexts of transfer learning, few-shot learning, and curriculum learning, and train on the MNIST and SVHN datasets. The results of the experiment demonstrate that our architecture on average achieves a similar test accuracy as the E2E baselines, and sometimes marginally better in certain experiments depending on the subnetwork combination.


NHUT PHAN

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Thesis: Deep Learning Methods to Identify Intracranial Hemorrhage Using Tissue Pulsatility Ultrasound Imaging

Traumatic Brain Injury (TBI) is a serious medical condition that occurs when a person experiences trauma to the head, resulting in intracranial hemorrhage (bleeding) and potential deformation of anatomical structures within the head. Detecting these abnormalities early is key to saving lives and improving survival outcomes. The standard methods for detecting intracranial hemorrhage are Computed Tomography (CT) and Magnetic Resonance Imaging (MRI); however, they are not readily available on the battlefield or in low-income settings. A team of researchers from the University of Washington developed a novel ultrasound signal processing technique called Tissue Pulsatility Imaging (TPI) that operates on raw ultrasound data collected with a hand-held, tablet-like ultrasound device. This research aims to build segmentation deep-learning models that take TPI data as input and detect the skull, ventricles, and intracranial hemorrhage in a patient's head. We employed the U-Net architecture and four of its variants for this purpose. Results show that the proposed methods can segment the brain-enclosing skull and are relatively successful at ventricle detection, while more work is needed to produce a model that can reliably segment intracranial hemorrhage.

Friday, June 3

MONALI KHOBRAGADE

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: EcoTrip Planner – An Android App

The emergence of online travel websites such as TripAdvisor, Priceline, Expedia, and KAYAK lets users book accommodations online without going through an agent. Users no longer wait in queues for flight tickets to their favorite destinations, and they can learn about a vacation destination through these websites, a task that previously depended solely on an agent's guidance. Users can book flights, hotels, and restaurants online; in short, they can plan a vacation trip after manually evaluating options such as price, flight timing and availability, hotel location, food options, and nearby places to visit. However, a recent study indicates that the abundance of options on online travel sites overwhelms users. The main challenge is that these websites do not provide a holistic trip plan, including flight and hotel accommodation, within the user's budget. In this paper, we provide a trip plan with flight and hotel suggestions under the user's given budget by using personalized factors and analyzing user experience. The aim of this project is to develop an Android mobile application that helps users plan trips under a given budget and fight information overload. Our application asks users for the vacation destination and the budget they can afford, along with their preferred hotel location, hotel star level, and ratings. It then analyzes the budget and uses heuristic models and natural language processing to recommend the best available travel and lodging. For travel, it suggests a round-trip plan from the current location to the destination; for hotels, it suggests the top 3 hotels with a personalized user experience. The system also extracts the top 5 keywords from hotel reviews, giving users an overall idea of each hotel. Our approach helps users plan a trip, including flight travel and hotel accommodation, in minutes.
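The budget heuristic described in the abstract can be sketched roughly as follows: pair each round-trip flight with each hotel, keep the combinations within budget, and rank hotels to suggest a top 3. This is a hypothetical illustration, not the project's actual algorithm; all data shapes, field names, and numbers are invented.

```python
# Hypothetical sketch of a budget-constrained trip planner: filter
# flight + hotel combinations by total cost, then rank the survivors.

def plan_trip(flights, hotels, budget, nights, top_n=3):
    """Return affordable (flight, hotel) plans, best-rated hotels first."""
    plans = []
    for flight in flights:
        for hotel in hotels:
            total = flight["price"] + hotel["nightly_rate"] * nights
            if total <= budget:
                plans.append({"flight": flight["id"], "hotel": hotel["name"],
                              "total": total, "rating": hotel["rating"]})
    # Rank by hotel rating (descending), then by total cost (ascending).
    plans.sort(key=lambda p: (-p["rating"], p["total"]))
    return plans[:top_n]

flights = [{"id": "F1", "price": 300}, {"id": "F2", "price": 450}]
hotels = [{"name": "Harbor Inn", "nightly_rate": 120, "rating": 4.5},
          {"name": "City Stay", "nightly_rate": 90, "rating": 4.0}]
print(plan_trip(flights, hotels, budget=800, nights=3))
```

A real system would pull flights and hotels from provider APIs and fold in the review-derived keywords, but the filter-then-rank shape stays the same.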


WILLIAM OTTO THOMAS

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Thesis: Human Cranium, Brain Ventricle and Blood Detection Using Machine Learning on Ultrasound Data 

Any head-related injury can be serious and may be classified as a traumatic brain injury (TBI), which can result from intracranial hemorrhaging. TBI is one of the most common injuries on or around a battlefield and can be caused by both direct and indirect impacts. While assessing a brain injury in a well-equipped hospital is typically routine, the same cannot be said of a TBI assessment outside a hospital. Typically, a computed tomography (CT) machine is used to diagnose TBI. This project, however, demonstrates how ultrasound can be used to predict where the skull, ventricles, and bleeding occur. The Pulsatility Research Group at the University of Washington has conducted three years of data collection and research to create a procedure for diagnosing TBI in field conditions. In this paper, machine learning methodologies are used to predict these CT-derived features. The results of this research show that, with adequate data and collection methods, the skull, ventricles, and potentially blood can be detected by applying machine learning to ultrasound-derived data.

WINTER 2022

Thursday, March 3

FANG-CHUN LIN

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: A Location-aware Grocery List Application

People are increasingly interested in fast and easy supermarket shopping. However, grocery shopping can be a complex and stressful process that involves identifying, selecting, and purchasing the items required for everyday life. The grocery list is one of the most common aids for carrying out these activities, yet the task of creating and managing shopping lists is often overlooked, and the effort and time it takes typically go unseen and unrecognized. This study aims to bridge that gap by designing and developing a modern shopping list application that facilitates creating and managing shopping lists for busy workers and assists them in locating products at nearby stores.

Existing shopping list applications lack an effective way to map shopping lists written in natural language to actual products in supermarkets. In addition, most mobile shopping assistants with search functionality rely on product information manually entered by retailers, so the recommended product lists are narrow, with limited options. This can be inconvenient and time-consuming for grocery shoppers in a hurry to locate specific products, especially when they are unfamiliar with the nearby stores.

To address these issues, we designed a location-aware shopping list application that can locate products at nearby stores. With API services and website information from supermarkets, it becomes possible to offer users the full range of products available online. Details of relevant products are displayed in the search results, along with a navigation map showing all nearby stores that carry them. Additionally, once a product is selected, the user receives a notification when it is available at a nearby store. To streamline the product selection process, our application ranks the product list based on the user's purchase history.

Our implementation began with a proposed user story for typical grocery shopping, followed by a derived system specification to efficiently support the hypothetical shopper. We then designed and developed a multi-tier system to prototype the modern shopping list application based on the specified requirements. The evaluation results illustrate the completeness of the prototype system, including grocery list management, the navigation map, and location-aware notifications. The results of a small-scale study show that the personalized search ranking achieved initial success in integrating user preferences and the specified items to recommend appropriate, personalized products for different users. Our study contributes to the understanding of the system and user-interface requirements of a shopping list application, and our project and results can serve as an effective reference for developers and researchers building similar applications.
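The personalized ranking step mentioned above can be illustrated with a minimal sketch: order the matching products by how often the user has bought them before, leaving never-purchased items in the store's default order. The data shapes and product names are assumptions for illustration, not the project's actual design.

```python
# Minimal sketch of purchase-history ranking for search results.
from collections import Counter

def rank_products(search_results, purchase_history):
    freq = Counter(purchase_history)  # product id -> times purchased
    # Stable sort: previously purchased items first, most frequent on top;
    # items never bought keep their original relative order.
    return sorted(search_results, key=lambda pid: -freq[pid])

history = ["oat-milk", "oat-milk", "bananas", "rye-bread"]
results = ["almond-milk", "oat-milk", "soy-milk"]
print(rank_products(results, history))  # oat-milk is promoted to the top
```

Because Python's sort is stable, ties (items with equal purchase counts) preserve the underlying search order, which is usually the desired fallback.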

Friday, March 4

RAHIL MEHTA

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: CT Metal Artifact Reduction using Unpaired Image-to-Image Translation

Computed tomography (CT) scans are a common diagnostic technique in medicine, used for the management of a range of conditions. One factor that can degrade the quality of CT scans is the presence of metal implants in the patient's body. Metal can cause streaks or bright spots on the image, known as artifacts, which can make an image harder for doctors to interpret, potentially impacting the quality of a patient's care. Traditional mathematical methods for reducing artifacts are limited in their effectiveness and can produce undesirable secondary artifacts. In recent years, researchers have applied machine learning techniques such as convolutional neural networks and generative adversarial networks (GANs) to CT metal artifact reduction with better results. The task of removing artifacts can be understood as translating images from one domain to another. The goal of this project is to apply the machine learning technique of contrastive unpaired translation (CUT) to metal artifact reduction and to explore changes to the network architecture. We used the SpineWeb and DuDoNet datasets to evaluate the effectiveness of our method. The results show that CUT can effectively eliminate most metal artifacts and was more effective at removing or reducing certain types of metal artifacts than CycleGAN and DualGAN. We explored adding the Convolutional Block Attention Module and found an improvement of more than 10% for color images from the SpineWeb dataset, based on the Frechet Inception Distance and Kernel Inception Distance. Areas for future work include training on a larger dataset, obtaining more diverse data, testing a greater number of training parameters, and exploring further changes to the network architecture.

Monday, March 7

AFROOZ RAHMATI

Chair: Dr. Afra Mashhadi
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Sound classification using Deep Embedded Clustering and Federated Learning

Applications of machine learning (ML) have an opportunity to positively impact a variety of fields. Recent work in digital health applications has demonstrated how ML can be leveraged to improve understanding of human health for both populations and individuals; however, many challenges remain. Federated Learning (FL) has been adopted to remove the data-pooling requirement for developing AI models, but the majority of research in this domain focuses on cross-silo applications using patient data from multiple clinical institutions; evidence for cross-device applications remains rare. Such research would enable models built on more diverse datasets rather than models biased by the patient populations of clinical institutions. In this project, we present an end-to-end system for cross-device FL for identifying abnormal heartbeat sounds. We propose a fully unsupervised learning model based on an LSTM autoencoder and Deep Embedded Clustering for detecting heartbeat biomarkers from audio recordings collected through smartphones and digital stethoscopes. We evaluate the performance of our model against existing benchmark algorithms and report competitive performance despite its comparatively light architecture. We show that our model learns to distinguish normal and abnormal heart sounds under the FL setting, achieving 97% accuracy on the PhysioNet Heart Dataset. We also validate the feasibility of our approach in terms of energy, memory, and computational power on ordinary smartphones.

Tuesday, March 8

DIEM TO

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Thesis: Traveler's Next Activity Prediction with Location-Based Social Network Data

The rise of technology and the internet provides powerful means for people around the world to communicate and connect with one another. Online social network platforms have become go-to places for users to express and share their individuality, including their choices of activities, locations, and associated timestamps. In turn, their opinions affect the points of view of others in their online friendship circles. Users' increasing use of social networks accumulates massive amounts of data that can be further explored. In particular, this type of data attracts researchers interested in studying how social factors and previous experience influence user behavior in activity-related travel choices. In this paper, the goal is to utilize such rich data sources to build a model that predicts a user's next activity. Such a model contributes a powerful tool for integrating location prediction with transportation planning and operations, and it is valuable in commercial applications for building more accurate recommendation systems that ultimately attract more customers to partnering businesses. By studying the dataset, which contains millions of historical check-ins from thousands of users, it is possible to derive information useful for predicting a user's next activity. The proposed approach applies machine learning techniques to the collected features to deliver highly accurate predictions with fast training and prediction times.


SAYALI KUDALE

Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Security Patterns Discovery to Mitigate Vulnerabilities

In the IT security industry, software vulnerabilities are a significant concern, as malicious hackers can often exploit these weaknesses for unethical ends. Identifying and fixing software vulnerabilities early in the development process can reduce costs and prevent reputational damage and litigation. One way to address security vulnerabilities is to use security patterns during the early phases of the software lifecycle; however, software developers who understand the design and implementation of software features and functionality are often not cybersecurity experts, so it is difficult to manually choose an appropriate security pattern from the vast security pattern catalog. Moreover, while there is substantial work on vulnerability prediction, little has been done on predicting a security pattern for a given vulnerability.

To address identified security vulnerabilities, this study proposes a new approach for predicting security patterns using keyword extraction and text similarity techniques. We built three datasets by parsing public websites and technical documentation: (1) security patterns; (2) Common Weakness Enumeration (CWE) vulnerabilities; and (3) CWE vulnerabilities in open-source repositories with their associated GitHub code-fix commit messages. The Security Pattern Discovery to Mitigate Vulnerability (SPDMV) algorithm was executed on these datasets, and a mapping from CWEs to security patterns was obtained.

The ground truth data was generated manually by assigning security patterns as solutions to the CWE software development categories and was reviewed by experts. To assess SPDMV, we compared two keyword extraction techniques for each vulnerability category: MALLET topic modeling and Rapid Automatic Keyword Extraction (RAKE). The evaluation results indicate that SPDMV can recommend security patterns for the most frequently occurring CWE vulnerabilities with 70% average precision using the RAKE keyword extraction approach. In the future, including data from other vulnerability sources, such as the Open Web Application Security Project (OWASP), may improve performance.
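The matching step in this kind of pipeline can be sketched simply: once keywords have been extracted for a CWE category and for each security pattern, a set-overlap score can rank candidate patterns. This sketch uses Jaccard similarity as a stand-in for the abstract's text-similarity techniques; the keywords and pattern names are invented for illustration.

```python
# Sketch: rank security patterns for a vulnerability by keyword overlap.

def jaccard(a, b):
    """Set-overlap score in [0, 1]: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_patterns(cwe_keywords, pattern_keywords):
    scores = {name: jaccard(cwe_keywords, kws)
              for name, kws in pattern_keywords.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

cwe = ["input", "validation", "injection", "sanitize"]
patterns = {"Input Validator": ["input", "validation", "sanitize", "filter"],
            "Secure Logger": ["log", "audit", "tamper"]}
print(rank_patterns(cwe, patterns))
```

In practice the keyword sets would come from RAKE or topic modeling rather than being hand-written, and a weighted similarity (e.g. TF-IDF cosine) would likely replace plain Jaccard.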

Wednesday, March 9

CRAIG RAINEY

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Algorithmic Stock Trading Framework

Well-funded hedge funds and banks have taken advantage of computerized stock trading since as early as the 1960s. Recently, personal computers, infrastructure, and data have become accessible to retail traders. These tools allow traders to develop and build their own algorithmic trading strategies based on a variety of data sources, such as market prices, sentiment, or news. Automated trading strategies can provide an edge over humans, but they require time and effort to develop and monitor. The purpose of this project was to implement a fast and scalable framework that supports multiple algorithmic trading strategies. I created two strategies: a rule-based momentum strategy and an AI-driven strategy that combines market and sentiment data. The rule-based momentum strategy used several technical indicators to define profitable entry and exit criteria over time. For the AI-driven strategy, technical indicators derived from market price data, together with sentiment data extracted from tweets, were used to train a machine learning model on how to profit from trading stocks. I then built a framework that automates these strategies by streaming real-time data and integrating with a stockbroker API to manage orders and accounts.

Thursday, March 10

SEETU AGARWAL

Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Accelerated Inference on Mobile Devices via Adaptive Model Selection

With the development of model compression, model selection, and other inference acceleration techniques, increasingly complex Deep Neural Network (DNN) models can run on resource-constrained mobile devices. To achieve a satisfactory user experience, DNN models must be adaptively selected for the hardware characteristics of mobile devices to balance important performance metrics such as accuracy, latency, and power consumption. State-of-the-art methods select the best model based on image features captured by mobile devices or on contextual and user-feedback information. This research designs a novel framework that comprehensively considers image features, mobile contextual information, and user feedback when selecting the best DNN model for inference on mobile devices. The framework first utilizes a series of KNN models running on edge servers to filter out a suitable subset of models based on image features. After obtaining this subset, the mobile device selects the best model for the current context using a model selection algorithm that draws on contextual information such as ambient brightness, battery level, CPU temperature, and DNN accuracy and latency. This algorithm continuously improves the Quality of Experience (QoE) through reinforcement learning. The proposed solution was evaluated on the image classification task using the ImageNet ILSVRC 2012 validation dataset. Experimental results show that our method achieves 79.8% top-1 accuracy and 96.1% top-5 accuracy, higher than the most accurate single DNN models. Its average inference time (1.48 s) is also much shorter than that of most individual DNN models. Additionally, the proposed solution achieves an average QoE of 0.5862, the highest in comparison with MobileNet (0.4423), MobileNet-V2 (0.5123), and Inception-V3 (0.5523).
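Reinforcement-learning-driven model selection of the kind the abstract describes can be illustrated with a simple bandit sketch: pick a model per request, observe a QoE-style reward, and update a running estimate. This is an epsilon-greedy toy, not the authors' algorithm; the model names echo the abstract, but the reward values and environment are invented.

```python
# Illustrative epsilon-greedy bandit over candidate DNN models.
import random

class ModelSelector:
    def __init__(self, models, epsilon=0.1):
        self.epsilon = epsilon
        self.qoe = {m: 0.0 for m in models}    # running QoE estimate per model
        self.count = {m: 0 for m in models}

    def choose(self):
        if random.random() < self.epsilon:      # explore a random model
            return random.choice(list(self.qoe))
        return max(self.qoe, key=self.qoe.get)  # exploit best estimate

    def update(self, model, reward):
        self.count[model] += 1
        n = self.count[model]
        self.qoe[model] += (reward - self.qoe[model]) / n  # incremental mean

random.seed(0)
sel = ModelSelector(["mobilenet", "mobilenet-v2", "inception-v3"])
for _ in range(200):
    m = sel.choose()
    # Toy QoE reward with noise; in the real system this would combine
    # accuracy, latency, and device context.
    reward = {"mobilenet": 0.44, "mobilenet-v2": 0.51, "inception-v3": 0.55}[m]
    sel.update(m, reward + random.uniform(-0.02, 0.02))
print(max(sel.qoe, key=sel.qoe.get))
```

The real framework additionally restricts the candidate set per image via edge-side KNN filtering before this kind of context-driven selection runs on the device.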

Friday, March 11

YASHASWINI JAYASHANKAR

Chair: Dr. Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Parallelizing LDA-GA (Latent Dirichlet Allocation - Genetic Algorithm) for Data Provenance Reconstruction

On the Internet, data can be created, copied, modified, and deleted easily, making it hard to rely on the authenticity of information sources or confirm their reliability. It is therefore necessary to reconstruct data provenance when no provenance information was previously documented. The provenance-reconstruction approach proposed by the "Provenance and Traceability Research Group", based on the Latent Dirichlet Allocation Genetic Algorithm (LDA-GA) and the Mallet library, was implemented in Java and achieved satisfactory results on small datasets. As dataset sizes increased, however, performance degraded. To improve accuracy and performance, GALDAR-C++, a multi-library extensible solution for topic modeling in C++, was developed; compared to the Java implementation, this solution using WarpLDA offered satisfactory results.

Parallel computing allows code to execute more efficiently, saving time and money while sorting through 'big data' faster than ever before. This project applies a parallel computing strategy, the Message Passing Interface (MPI), to both the Java and C++ versions of the code by parallelizing the LDA calls in each generation of the LDA genetic algorithm. Both implementations improve performance over their respective serial implementations. The improvement from the original C++ version to the parallel C++ version varies between 10% and 64%, depending on input size; similarly, the parallel Java version improved on the original C++ version by 35% to 78%, depending on input size. Overall, for larger datasets, parallel C++ provides the best results, approximately a 2x speedup, while keeping the same accuracy. Future studies on provenance reconstruction over large datasets may find this a feasible solution.
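The parallelization opportunity described above rests on one property: the LDA runs within a single genetic-algorithm generation are independent of each other, so they can be evaluated concurrently. The sketch below uses Python's `concurrent.futures` as a stand-in for MPI, and `run_lda` is a dummy fitness function; in the real system each call would train a topic model and return its quality score.

```python
# Stand-in sketch for parallel fitness evaluation of one GA generation.
from concurrent.futures import ThreadPoolExecutor

def run_lda(params):
    """Dummy stand-in for one LDA training run; returns a mock fitness."""
    num_topics, iterations = params
    return 1.0 / (1 + abs(num_topics - 12)) + iterations * 1e-4

def evaluate_generation(population):
    # Each individual's LDA call is independent, so the generation can be
    # evaluated concurrently -- the same property the MPI version exploits
    # by giving each rank a share of the generation.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_lda, population))

print(evaluate_generation([(8, 100), (12, 100), (20, 50)]))
```

With MPI, the scatter/gather of individuals across ranks replaces the thread pool, but the per-generation synchronization point (all fitness values must return before selection) is the same.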

AUTUMN 2021

Thursday, December 2

APOORVA BHATNAGAR

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: GIF Image Descriptor

Social media platforms feature short animations known as GIFs, but these are inaccessible to people with vision impairments, who rely heavily on accessibility tools such as screen readers to understand the context of an image. In this paper, we present a project that aims to improve the accessibility of web pages by auto-injecting machine-generated but accurate descriptions of GIF content into the image's alt-text tag on the web. We use machine learning and a web browser extension to create a software system, the "GIF Image Descriptor," to address an often-overlooked issue: web accessibility for non-static images. The system uses a CNN-LSTM architecture to generate descriptive text for GIF images, because CNNs excel at encoding images and videos while LSTMs work well for sequence generation. We examine two popular model architectures, 2D CNNs and 3D CNNs, for the first neural network, and settle on a hybrid 2D CNN model that takes multiple frames of a single image to create vector embeddings, which then become the input to our LSTM network. This approach gives substantial improvements in accuracy metrics such as BLEU (7%), ROUGE (6%), and METEOR (10%) compared to the existing TGIF model that generates captions for GIF images. A usability study performed with a focus group of individuals with and without vision disabilities further shows that the system performs well for its target audience.

Friday, December 3

PINKI PATEL

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Yoga Pose Correction and Classification

Yoga is an ancient physical science that originated in India over 5,000 years ago and has been popularized in the West for its various benefits. Many people practice yoga on their own by following TV or videos; however, it is not easy for novices to find the incorrect parts of their yoga poses by themselves. Therefore, in this paper we introduce a web application, YogaAndPoses, which provides two benefits to users: Yoga Pose Correction and Yoga Pose Classification.

The goal of Yoga Pose Correction is to use pose estimation technology so that a person can practice different yoga poses using a webcam and receive feedback on them. Since pose estimation is a well-explored area of computer vision and deep learning, we leverage public work for our application. Our pose correction uses CMU's open-source pose estimator, OpenPose, as a pre-processing module to detect the joints of the human body and extract their keypoints. We use these keypoints to calculate the angle difference between the yoga pose of an instructor/expert and that of a novice, and suggest a correction if the angle difference exceeds a given threshold. For evaluation, we applied Yoga Pose Correction to three people and confirmed that it found the incorrect parts of their yoga poses.
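The angle-comparison step above can be sketched in a few lines, assuming keypoints have already been extracted (e.g. by OpenPose): compute the angle at a joint from three keypoints and flag the joint if it deviates from the expert's angle by more than a threshold. The coordinates and the 15-degree threshold here are illustrative, not the application's actual values.

```python
# Sketch: joint-angle comparison between an expert pose and a novice pose.
import math

def joint_angle(a, b, c):
    """Angle at point b (degrees) formed by segments b->a and b->c."""
    ang = math.degrees(math.atan2(c[1] - b[1], c[0] - b[0]) -
                       math.atan2(a[1] - b[1], a[0] - b[0]))
    ang = abs(ang)
    return 360 - ang if ang > 180 else ang

def needs_correction(expert_pts, novice_pts, threshold_deg=15):
    return abs(joint_angle(*expert_pts) - joint_angle(*novice_pts)) > threshold_deg

# Expert holds a straight arm (180 deg); the novice's elbow is bent (90 deg).
expert = [(0, 0), (1, 0), (2, 0)]
novice = [(0, 0), (1, 0), (1, 1)]
print(needs_correction(expert, novice))  # the 90 deg difference exceeds 15
```

A full implementation would evaluate this check over every tracked joint triple (shoulder-elbow-wrist, hip-knee-ankle, etc.) on each webcam frame.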

We also introduce Yoga Pose Classification, which classifies 82 yoga poses that are very challenging in terms of viewpoint. It allows a user to upload an image of a yoga pose, or capture one with a web camera, and classifies the pose using the DenseNet-201 pretrained model. Before sending an image to DenseNet-201, we perform various pre-processing steps such as resizing, augmentation, segmentation, and converting grayscale images to RGB. We obtained 64% top-1 accuracy and 78% top-3 accuracy over 82 yoga poses, whereas prior state-of-the-art yoga pose classifiers achieved similar accuracy with at most 20 poses.

Monday, December 6

SUKRITI TIWARI

Chair: Dr. David Socha
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Art Website: UX Design and Development

The aim of this undertaking is to design and develop an art website for an individual artist, the author's mother, Mrs. Kavita Tiwari. The website will act as an online portfolio, showcase her skills, and advertise her offerings. Artists often find it difficult to market their products because they are engrossed in perfecting their craft and lack marketing prowess. Creating a website is a concrete step toward establishing an online presence for streamlined sales.

The methods chosen for reaching the goal of a full-fledged artist website include determining the scope of the project through semi-structured user interviews and competitive analysis of existing websites; identifying and segmenting the types of users of the website; understanding the users' needs; selecting the features to be implemented; designing and creating a prototype using Figma; and finally developing and testing the website. The author also wishes to use this opportunity to help with sales and to learn full-stack web development.

Tuesday, December 7

TEJASWINI MADAM

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: MealFit - Health Monitoring Mobile Application

Obesity is one of the major problems many people face today, and the last few decades have seen a rapid increase in dietary ailments caused by unhealthy eating routines. Mobile dietary assessment systems that record meal consumption and physical activity can be very handy and can improve dietary habits. This paper proposes a novel system that allows users to capture images of meals they consume and upload them to a mobile application. The application uses a pre-trained convolutional neural network model, MobileNet, to classify the meal images. To improve accuracy, pre-processing is performed using the Keras image data generator, which takes a batch of images and applies random transformations to each image in the batch, generating multiple variant copies of each image for training. Users can monitor consumption details via the mobile application and receive regular alerts on the percentage of calories consumed relative to their assigned daily calorie limit. Additional features for managing daily calorie consumption, such as access to different levels of exercises, nutrition tips, and calendar modules, were developed to motivate users. The experimental results demonstrate that the application recognizes food items accurately, with over 87% validation accuracy. Usability studies show the application is user friendly and generates meaningful notifications, giving users clear insight into healthy dieting and guiding their daily consumption to improve health and wellness.
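The alerting logic described above can be sketched as a simple threshold check: once the classified meal's calories are added to the daily total, the user is alerted when the running percentage of the assigned limit crosses a threshold. The thresholds, message wording, and numbers below are invented for illustration.

```python
# Hypothetical sketch of the daily-limit calorie alert.

def calorie_alert(consumed, daily_limit, thresholds=(0.5, 0.8, 1.0)):
    """Return an alert message once any threshold of the limit is crossed."""
    pct = consumed / daily_limit
    crossed = [t for t in thresholds if pct >= t]
    if not crossed:
        return None  # still comfortably under the first threshold
    return f"You have consumed {pct:.0%} of your daily limit."

print(calorie_alert(1700, 2000))
```

A real app would track which thresholds were already announced so each alert fires only once per day.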

Friday, December 10

CARLA COLAIANNE

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Developing Autonomous Mining Through a Simulation and Reinforcement Learning

Fully autonomous mining could provide many opportunities to safely and efficiently mine materials on Earth and in extraterrestrial environments. Because of the complexity of autonomous mining, it is beneficial to have a method for testing solutions rapidly and safely, which motivates the development of simulations that accurately represent the real excavator and the environment it interacts with. In addition to simulations, reinforcement learning is considered a promising approach for developing autonomous algorithms for the complex task of mining material, with the key advantage of being able to interact with and learn from a changing environment. This project creates a simulation that models a robotic excavator and its interaction with lunar simulant soil, and then uses reinforcement learning to create an autonomous mining algorithm capable of reaching a given depth in the soil and producing motions that can overcome any soil resistance the excavator may face.

SUMMER 2021

Thursday, July 29

VICTORIA SALVATORE

Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
12:00 P.M. (noon); Online
Thesis: Demonstrating Software Reusability: Simulating Emergency Response Network Agility with a Graph-Based Neural Simulator

This research validates the re-engineering of a neural network simulator to implement other graph-based scenarios. Most components were abstracted to increase reusability and maintainability through strategic refactoring decisions. This paper demonstrates how the simulator, developed at the University of Washington Bothell, can be applied to another graph-based problem: the resilience of the US's Next-Generation 911 (NG-911) system in the face of a crisis. The research focuses on separating the neuro-specific components from the simulator's architecture and verifying its functionality as reusable software. It also includes first-person interviews, literature reviews, data analyses, and NG-911 system research to establish the system requirements for the NG-911 test-bed. Initial results demonstrate that when a crisis destroys critical parts of emergency response infrastructure, the NG-911 test-bed can reroute calls. This supports future work investigating the patterns that emerge from the interconnected events of a regional emergency response network. By applying previous research findings on the self-organizing behavior observed in both neural networks and emergency response networks during catastrophic events, this research will also contribute to the demonstration of self-organized criticality in complex networks. The NG-911 implementation of the simulator is intended to model the resilience of emergency response infrastructure at varying levels of network connectivity.

Monday, August 2

KHADOUJ FIKRY

Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Photo Mosaic Reassembling Images & Image Matching Techniques

Art often produces an artifact that is appreciated for its richness, meaning, message, and emotional power. Mosaic is an art form that involves setting small tiles together to create an art piece. An image captures a moment in time and results in an artifact that can be printed or digitized. Photomosaic is an engineering solution that combines the techniques of art, mosaic, and images. Given an input image and a library of images, the algorithm mimics mosaic techniques to regenerate the input image by stitching images from the library together. Selecting the right image to stitch at the right position, while maintaining the original image's features and look, requires intensive image analysis and matching techniques. This project focuses on producing quality photomosaics by honing different image analysis techniques. A combination of a weighted BGR and a large color histogram to extract image statistics produces a quality photomosaic that is very similar to the original image. Numerous image analysis techniques are articulated in this project to help quantify and evaluate the effectiveness of the proposed solution.
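The tile-selection step can be sketched with a coarse color-histogram comparison. This is a minimal stand-in for the weighted-BGR and color-histogram statistics described above, not the project's actual matcher; the toy tile library and pixel values are hypothetical.

```python
# Pick the best-matching library tile for an image patch by comparing
# coarse BGR color histograms (4 bins per channel).
def histogram(pixels, bins=4):
    # pixels: list of (b, g, r) tuples with channel values in 0..255
    h = [0] * (bins * 3)
    for b, g, r in pixels:
        h[b * bins // 256] += 1              # blue bins
        h[bins + g * bins // 256] += 1       # green bins
        h[2 * bins + r * bins // 256] += 1   # red bins
    total = len(pixels)
    return [v / total for v in h]            # normalize so sizes can differ

def l1_distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

def best_tile(patch, library):
    hp = histogram(patch)
    return min(library, key=lambda name: l1_distance(hp, histogram(library[name])))

# A reddish patch should match the reddish tile in a toy two-tile library.
patch = [(10, 10, 240)] * 16
library = {"red": [(0, 0, 255)] * 16, "blue": [(255, 0, 0)] * 16}
assert best_tile(patch, library) == "red"
```

Repeating this match for every grid cell of the input image, and stitching the winning tiles together, yields the basic photomosaic pipeline.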

Tuesday, August 3

MARCEL OKELLO

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Password Feedback Meters: The Development and Validation of a Testbed to Examine the Efficacy of Peer-Feedback and Other Password Meter Types

Passwords have become entrenched in front-end authentication. Over the years, system administrators and security experts have stressed that users should create strong passwords with little information on how to actually accomplish this. Many websites and apps will provide a password strength meter to gauge if the user is complying with the password policies of the given site or app. While these meters have proven effective at forcing users to create stronger passwords, they have not encouraged users to intrinsically want to create a strong password. With many users feeling like creating a password is a cumbersome, secondary task, we wanted to see if this could be improved. Based on the concept of social influence, we developed a password strength meter testbed to determine if a password feedback meter that uses peer-feedback is more efficacious than traditional password feedback meters, no password feedback meter, and a novelty password feedback meter. We evaluate if this helps users not only create stronger passwords due to mandatory requirements, but want to create stronger passwords intrinsically. The testbed itself is adaptable and designed in such a way that other types of password feedback meters may be employed for testing. Our results are interesting and our next steps would be to improve the peer-feedback meter to provide targeted feedback to the specific user based on their password selection.
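Traditional meters of the kind described above typically score a password by its length and character-class composition. The following is a minimal sketch of that idea (a rough entropy estimate, not the testbed's actual scoring logic):

```python
import re
import math

# Estimate password strength in bits: length times log2 of the size of
# the character pool the password draws from.
def strength(password):
    pool = 0
    if re.search(r"[a-z]", password):
        pool += 26
    if re.search(r"[A-Z]", password):
        pool += 26
    if re.search(r"[0-9]", password):
        pool += 10
    if re.search(r"[^a-zA-Z0-9]", password):
        pool += 33  # rough count of printable ASCII symbols
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)

# Mixing character classes raises the estimated strength.
assert strength("password") < strength("P4ssw0rd!")
```

A peer-feedback meter, by contrast, would pair a score like this with a comparison against how other users' passwords fared, which is the social-influence mechanism the testbed is built to evaluate.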

Wednesday, August 4

DYLAN SHALLOW

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Document Clustering with Knowledge Graphs

Modern problems steadily require the management, processing, and storage of increasingly large datasets, which in turn requires harnessing larger computational resources. An example of such a problem is clustering documents using knowledge graphs. Representing documents with knowledge graphs will enable scholarly article databases to better serve researchers by providing accurate and interpretable information about the relationships between documents, which can be used to present better search results and suggestions for related works. In this work, we propose the use of knowledge graphs for this purpose. Knowledge graph representation and clustering of data may also be applied to other areas of research, such as data provenance, search result clustering, and abstract synthesis. Performing the necessary operations on large graphs requires tools like MPI, which can conveniently scale up to meet our needs. Because the problem at hand is computationally expensive, a compute cluster, Purdue Scholar, is introduced to accommodate larger computing resource needs, and it has made it possible to make strides in this project. Researchers and developers need access to these resources as well as the expertise and tooling to write efficient and correct software that can take advantage of them. As part of this project, we also share experiences and lessons learned, with the goal of paving the road for future UWB students and encouraging more focus on high performance computing (HPC) and remote development in this or other projects, trailblazing collaboration opportunities. Quantitatively, our results show how we are able to use MPI to harness the available performance of Purdue Scholar, scale up our system, and process our datasets in a reasonable amount of time compared to previous efforts.
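The document-clustering idea can be illustrated with a toy sketch: represent each document by the set of knowledge-graph entities it mentions, then group documents whose entity overlap (Jaccard similarity) exceeds a threshold. The entity sets and threshold below are hypothetical, and this single-process sketch stands in for work that the project distributes with MPI.

```python
# Jaccard similarity of two entity sets: |intersection| / |union|.
def jaccard(a, b):
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.3):
    # Greedy single-pass clustering: join the first cluster containing a
    # sufficiently similar member, else start a new cluster.
    clusters = []
    for name, entities in docs.items():
        for c in clusters:
            if any(jaccard(entities, docs[m]) >= threshold for m in c):
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

docs = {
    "paper_a": {"MPI", "clustering", "graphs"},
    "paper_b": {"MPI", "graphs", "HPC"},
    "paper_c": {"proteins", "cryo-EM"},
}
assert cluster(docs) == [["paper_a", "paper_b"], ["paper_c"]]
```

At scale, the pairwise similarity computations are the expensive part, which is why the project distributes them over a compute cluster with MPI.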

Thursday, August 5

JOE LEUNG

Chair: Dr. Kelvin Sung
Candidate: Master of Science in Computer Science & Software Engineering
1:00 P.M.; Online
Thesis: A User Programmable System Agnostic Real-Time Ray-Tracing Framework

Ray tracing is a well-understood rendering technique that can produce photorealistic or complex visual effects. Until recently, customizing ray tracing systems posed challenges because of a lack of infrastructure to program the rendering pipeline. To address the customizability issue in the ray tracing pipeline, vendors have proposed their own ray tracing frameworks to integrate user programs. However, most of the existing solutions target proprietary platforms, which limits access to ray tracing for the general public. In this project, we propose a platform-agnostic ray tracing framework based on the general programming capacity found in commodity graphics processing units (GPUs). The framework supports six types of customizable user programs, including custom geometries, shading, and lighting. The framework also provides developers with utilities such as a built-in acceleration structure, secondary ray generation, and material instancing. We have demonstrated that our framework fulfills the common functional requirements found in existing commercial products without the platform limitations. We also identified tradeoffs in developing a flexible, cross-platform ray tracing framework running on GPUs. Our results provide insights for future development of similar ray tracing frameworks.
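The geometric core that any such framework must expose to user programs is ray-primitive intersection. As a self-contained illustration (not code from the framework, which runs on GPUs), here is the classic ray-sphere intersection test:

```python
import math

# Return the distance t along the ray to the nearest forward hit on the
# sphere, or None if the ray misses it. The ray is origin + t*direction.
def intersect_sphere(origin, direction, center, radius):
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction
    # Substituting the ray into the sphere equation gives a quadratic in t.
    a = dx * dx + dy * dy + dz * dz
    b = 2 * (ox * dx + oy * dy + oz * dz)
    c = ox * ox + oy * oy + oz * oz - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                          # ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2 * a)     # nearer of the two roots
    return t if t > 0 else None              # hit must be in front of the origin

# A ray down the z-axis hits a unit sphere centered 5 units away at t = 4.
t = intersect_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1)
assert abs(t - 4.0) < 1e-9
```

In a programmable pipeline, user-supplied custom-geometry programs replace this hard-coded test, while the framework supplies the acceleration structure that decides which primitives each ray must test.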

Friday, August 6

BENJAMIN D. PITTMAN

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Multi Agent Spatial Simulation (MASS) in multiple GPU Environment with NVIDIA CUDA

Agent-based models are used to simulate real-world random and pseudo-random systems. These models leverage many autonomous agents and require high computation capabilities. The Multi-Agent Spatial Simulation (MASS) library grew from this need and is now implemented independently in Java, C++, and CUDA. MASS implements Agents and Places that interact with each other, where Places remain resident and Agents may migrate to other Places.

The cloud computing infrastructure, its ever-expanding compute capabilities, and the increasing demands for these resources to solve challenging problems demand that MASS CUDA grow to leverage multiple graphics processing units (GPUs). This research accomplishes this through refactoring of the existing code base to initialize models on multiple GPU devices using parallel computing algorithms and strategies. These improvements include (1) ghost-spacing in the globally distributed Place array, (2) GPU direct communication, and (3) an application-specific garbage collection and provisioning scheme for Agents.

This MASS CUDA implementation allows larger model sizes and faster processing times for simulations larger than 10,000 Place objects, more than doubling the model sizes achievable with the previous implementation. This research also implemented a simplified BrainGrid neural signal simulation, which is not yet finished. Because that simulation requires Agents to be added throughout processing, this research implemented Agent termination and refactored Agent spawning to facilitate it. The dynamic Agents implementation allows for a greater number of simulations using the MASS CUDA library.

MASS CUDA Multiple GPU (MGPU) is an efficient choice for large simulation sizes. Future research should test different data apportioning schemes over devices; test sharing and splitting computations amongst host and devices; and test related strategies for coalescing memory in the system and with like computations.
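The ghost-spacing scheme in improvement (1) can be illustrated with a toy partitioner (hypothetical names, not the MASS CUDA API): each device's slice of the global Place array carries read-only copies of its neighbors' boundary elements, so local stencil-style updates need no mid-step communication.

```python
# Split a global Place array across devices, giving each slice one-element
# "ghost" copies of its neighbors' boundary Places (None at the edges).
def partition_with_ghosts(places, n_devices):
    size = len(places) // n_devices
    slices = []
    for d in range(n_devices):
        lo, hi = d * size, (d + 1) * size
        left = places[lo - 1] if lo > 0 else None           # left ghost cell
        right = places[hi] if hi < len(places) else None    # right ghost cell
        slices.append((left, places[lo:hi], right))
    return slices

places = list(range(8))                 # toy global Place array
s = partition_with_ghosts(places, 2)
assert s[0] == (None, [0, 1, 2, 3], 4)  # device 0 sees device 1's first Place
assert s[1] == (3, [4, 5, 6, 7], None)  # device 1 sees device 0's last Place
```

After each simulation step, only the ghost cells must be refreshed between devices, which is where GPU direct communication (improvement 2) pays off.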

Thursday, August 12

QIAOYU (IRIS) ZHENG

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Semi-Automatically Style Landscape Site Plans With Machine Learning

A landscape site plan is a graphic representation that shows the arrangement of landscape elements such as trees, buildings, ground, grassland, and water from an aerial view. Restyling a site plan involves adjusting the colors, textures, and other artistic customizations for the various elements in the plan. This is a common task for contemporary landscape architects, who spend most of their time working on computers. Our research develops a web application that semi-automatically restyles landscape site plans using machine learning. The system’s workflow is as follows: a) receive the user’s uploaded site plan image; b) recognize the semantic meaning of the image; c) visualize the recognition results; d) provide an interface for modifying and customizing the detected results and the style settings; e) style the uploaded image based on the detection results and the user’s modifications and customizations. This report presents our approach of combining two machine learning models to detect elements on a landscape site plan: a Mask R-CNN model detects single elements such as trees and buildings, while a U-Net model detects surface elements such as grassland, ground, and water. This paper also describes the training process and the app’s interface design. Additional tests are conducted to further evaluate the application’s performance, including intersection-over-union (IoU) scores on single and surface elements, time consumption, a user-experience satisfaction survey score, and a restyle-quality satisfaction survey score.
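The IoU metric used in the evaluation is simple to state concretely: the area where the predicted and ground-truth regions overlap, divided by the area they jointly cover. A minimal sketch on flattened binary masks (toy data, not the project's evaluation code):

```python
# Intersection-over-union (IoU) of two binary masks given as flat lists
# of 0/1 pixel labels.
def iou(mask_a, mask_b):
    inter = sum(a and b for a, b in zip(mask_a, mask_b))   # both marked
    union = sum(a or b for a, b in zip(mask_a, mask_b))    # either marked
    return inter / union if union else 0.0

# Two flattened 2x3 masks overlapping in 2 of the 4 marked pixels.
pred  = [1, 1, 0, 1, 0, 0]
truth = [0, 1, 0, 1, 1, 0]
assert iou(pred, truth) == 0.5
```

An IoU of 1.0 means the detected element matches the ground truth exactly; scores near 0 mean the detector marked the wrong pixels.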


NUOYA (NORA) WANG

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Developing a health therapy chatbot using Natural Language Processing

Family caregivers experience a significant caregiving burden that negatively impacts their physical and mental health. The current support system for family caregivers is fragmented, with limited access and high cost. In this project, we aim to build a practical health therapy dialog system based on Hybrid Code Networks (HCNs) and Problem-Solving Therapy (PST) to help caregivers schedule tasks, manage emotional needs, and alleviate their symptoms. We developed a Wizard of Oz system to conduct therapy sessions for data collection. We processed the collected data with entity annotation, utterance augmentation, and embedding. We created a new domain-specific model, a PST module within HCNs, then trained and compared our system against the original HCNs and other end-to-end task-oriented dialogue systems. Results show that our system achieved high dialogue accuracy and outperformed the other systems. Finally, we designed a web interface and API that allow users to interact with the chatbot we built.

Friday, August 13

JESSICA FRIERSON

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: A Textual Data Extraction Application to Generate Events from Mobile Camera Image

Pictoevents is an application inspired by the complexity found within parenting plan schedules. Because of the sensitivity of such schedules, data is not openly available; therefore, this project cannot directly accommodate parenting plan schedules. Instead, we propose to solve the problem of manually entering dates for an event as an entry point to addressing parenting plans. Most calendar applications require the user to manually add an event, apart from peripheral tools such as voice commands that can create one. To simplify the process of creating a new event on a mobile device’s calendar, this project aims to create a simple application that can create a calendar event with minimal user action. To accomplish this, computer vision techniques are applied through APIs to process an image taken by the user and extract text via optical character recognition. The extracted text is parsed using regular expressions and added to a template data structure that helps populate the data needed to create an event on an Android mobile device. A title for the event is also generated using natural language processing, which identifies a title composed of the most relevant words. This is achieved using a pipeline of techniques to identify subjects, verbs, and nouns, as well as noun chunking, to find an event title. The result of this project is an application that can identify an event from images and conveniently add it to the device’s calendar.

SPRING 2021

Friday, May 21

SANJUSHA CHEEMAKURTHI

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Mobile application for sign language learning

Sign language plays a significant role in bridging the communication gap between the deaf and mute and the general public. Though sign language learning applications are available these days, there is a considerable deficiency in the area of real-time testing. This research aims to provide a solution for this challenge by building a real-time sign language learning application for mobile devices. This iOS application is used to learn the alphabet and digits of American Sign Language (ASL) and also helps users evaluate their skills in real time. To facilitate real-time evaluation, a machine learning model is trained using an ASL image dataset. This dataset combines static sign language gestures captured using an iPhone, a dataset developed by Massey University, and the publicly available ASL alphabet from Kaggle. Preprocessing is performed to bring all images in the dataset to a common size. Segmentation is then performed to subtract the background using skin-color detection. These processed images are used to train a convolutional neural network consisting of multiple convolutional layers and filters, which help extract features and train the model effectively. The proposed framework is successfully implemented on a smartphone platform and performs with an average testing accuracy of 99.7% using 5-fold cross-validation and an evaluation accuracy of 70%.
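The skin-color segmentation step can be sketched with one classic RGB thresholding rule from the skin-detection literature. This is an illustration of the general technique, not necessarily the rule or color space this project uses, and the sample pixel values are hypothetical.

```python
# Classic RGB skin-color rule: bright enough, sufficiently saturated,
# and red-dominant. Pixels failing the rule are treated as background.
def is_skin(r, g, b):
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def segment(pixels):
    # Keep skin pixels; zero out everything else (background subtraction).
    return [(r, g, b) if is_skin(r, g, b) else (0, 0, 0) for r, g, b in pixels]

hand = (200, 140, 120)   # typical skin tone: kept
wall = (90, 90, 200)     # bluish background: zeroed
assert segment([hand, wall]) == [hand, (0, 0, 0)]
```

Masking the background this way before training lets the convolutional network focus its filters on the hand shape rather than on scene clutter.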


SATINE PARONYAN

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Agent-Based Computational Geometry

The Multi-Agent Spatial Simulation (MASS) library is a parallel programming library that uses an agent-based modeling (ABM) parallelization approach over a distributed cluster. The MASS library contains several applications solving computational geometry problems using ABM algorithms. This research aims to build four additional ABM algorithm-based applications: (1) range search, (2) point location, (3) largest empty circle, and (4) Euclidean shortest path. This research presents ABM solutions implemented with the MASS library as well as divide-and-conquer (D&C) solutions to the four problems implemented with the big data parallelization platforms MapReduce and Spark. In this paper, we discuss the design approaches used in the solutions to the four problems. We present ABM and D&C algorithms on the MASS, MapReduce, and Spark platforms. We provide a detailed analysis of programmability and execution performance metrics of the ABM implementations with MASS against the D&C versions with MapReduce and Spark. Results showed that the MASS library provides an intuitive approach to developing parallel solutions to computational geometry problems. We observed that ABM MASS solutions produce competitive performance results when performing in-memory computations over distributed structured datasets.

Monday, May 24

XIAOTIAN (ALEX) LI

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Agent-based parallelization of multidimensional semantic database model

Conventional database systems often ignore the semantic meaning of data in autonomous databases, making it difficult to extract significant information for different querying contexts. The Mathematical Model of Meaning (MMM) is a meta-database system that extracts features from a database and explains the database by those features. It provides users with the ability to extract significant information under different semantic spaces. A semantic space is created dynamically with user-defined impression words to compute semantic equivalence and similarity between data items. MMM computes semantic correlations between a key data item and other data items to achieve dynamic data querying.

The Multi-Agent Spatial Simulation (MASS) library is a parallel programming library that utilizes agent-based modeling (ABM) to parallelize big data analysis. This project presents multiple parallel solutions that improve the performance of MMM using MASS. Compared to the sequential MMM program, the parallel solution using MASS achieved a 23-times speedup on matrix multiplication. MASS also reduced the processing time of distance sorting of multidimensional vectors by 23.70%. Additionally, this work conducted a benchmark analysis between MASS and MPI Java to show the advantages of agent-based behavior, and performed a quantitative and qualitative programmability analysis regarding boilerplate ratio, the number of extra classes and functions, and development effort.


SNEHA MANCHUKONDA

Chair: Dr. Min Chen
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Classification of Customer Reviews

A fake review is a review written by someone who has not used the product or service. Fake reviews are designed to give a false impression to customers at the time of purchase. Emotion, competition, and intellectual laziness are some of the reasons for writing fake reviews. They affect purchase decisions and result in financial losses to customers. Therefore, an effective solution is necessary to identify fake reviews. Existing approaches to fake review detection consist of machine learning models with reviewer-, review-, and social-related features. The problem with these approaches is that the model and its features are website dependent. Websites like Yelp, Amazon, and TripAdvisor provide different levels of user metadata for reviews: some might include a rating, name, or verified-purchase flag in the review section while others might not. To address this issue, we develop two uber machine learning models, pluggable into any website, that take only the text of the review. The textual features extracted from reviews are fed to the machine learning models to classify each review. The average accuracy of fake review detection across different websites is 80%. In the future, any website can extend the current machine learning model by adding reviewer features specific to that website for greater accuracy.

Wednesday, May 26

ROBERT LAURENSON

Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Thesis: Method of Adding Color Information to Spatially-Enhanced, Bag-of-Visual-Words Models

This thesis provides a late-fusing method, based on the HoNC (Histogram of Normalized Colors) descriptor, for combining color with shape in spatially-enhanced BOVW models to improve predictive accuracy for image classification. The HoNC descriptor is a pure color descriptor with several useful properties, including the ability to differentiate achromatic colors (e.g., white, grey, black), which are prevalent in real-world images, and to provide illumination intensity invariance. The method is distinguishable from prior late-fusing methods that utilize alternative descriptors, e.g., hue and color names descriptors, which lack one or both of these properties. The method is shown to boost predictive accuracy by about 1.9% to 3.2% for three different spatially-enhanced BOVW model types, selected for their suitability for real-time use cases, when tested against two datasets (Caltech101 and Caltech256) across a range of vocabulary sizes. The method adds about 150 to 190 ms to the model's total inference time.

Thursday, May 27

ZICAN LI

Chair: Dr. Wooyoung Kim
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Network Motif Detection: Toward efficient and complete package

Network motifs, with their statistical significance, are frequent and unique subgraph patterns in a network. Although various algorithms for network motif detection have been introduced, a traditional approach includes the following three steps. First, it enumerates all occurrences of subgraph patterns of a given size in the original graph. Second, it generates a large number of random graphs with the same degree sequence as the original graph and repeats the first step. Lastly, it determines the significance of subgraph patterns through statistical analysis. In 2007, Grochow and Kellis proposed a new, motif-centric method, differentiating it from the traditional network-centric approach. While the network-centric approach finds network motifs from a given input, the motif-centric approach finds the occurrences of a specific subgraph pattern in the original network. To provide both approaches on one platform, we developed a web-based program, Nemo, which includes both network-centric and motif-centric programs.

On the other hand, while we were investigating the network-centric approach for possible improvements in computational efficiency, we realized the second step of the traditional approach can be greatly improved. That step generates a large number of random graphs so that network motifs can be determined by comparing frequencies against the random pool. Wernicke classified it as EXPLICIT because it generates random graphs explicitly. He then proposed an alternative algorithm named DIRECT, which determines network motifs without explicit graph generation. However, it was never adopted for detecting network motifs in practice due to its ambiguous statistical testing method. Here, we investigated the DIRECT method, implemented the algorithm with a different statistical measurement, and applied it to various biological networks to detect network motifs. Experimental results support that for small subgraph sizes, the DIRECT method is a feasible alternative to EXPLICIT, since the two produce consistent results while DIRECT delivers superior performance.

Additionally, we added the motif-centric algorithm and the DIRECT method of the network-centric approach as extensions to the open-source library NemoLib, which originally contained a network-centric approach with the EXPLICIT random generation method only. We expect that NemoLib with these additional features will help accelerate the use of network motifs in real-world applications.
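The enumeration step of the network-centric approach can be sketched for size-3 subgraphs, where the only two connected undirected patterns are paths (2 edges) and triangles (3 edges). This toy illustration is not NemoLib's implementation, and real motif detection would compare these counts against a random-graph pool.

```python
from itertools import combinations

# Count the connected 3-node subgraph patterns of an undirected graph.
def count_size3_patterns(edges):
    nodes = {n for e in edges for n in e}
    edgeset = {frozenset(e) for e in edges}
    counts = {"path": 0, "triangle": 0}
    for trio in combinations(sorted(nodes), 3):
        # number of edges present among the three node pairs
        k = sum(frozenset(p) in edgeset for p in combinations(trio, 2))
        if k == 2:
            counts["path"] += 1
        elif k == 3:
            counts["triangle"] += 1
        # k < 2 means the trio is disconnected and is not a subgraph pattern
    return counts

# A triangle plus one pendant edge contains 1 triangle and 2 paths.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
assert count_size3_patterns(edges) == {"path": 2, "triangle": 1}
```

The EXPLICIT approach repeats this enumeration over many degree-preserving random graphs to judge significance; DIRECT avoids generating those random graphs at all.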

Friday, May 28

ISWARYA HARIHARAN

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Associate-Degree-Plan scheduling and recommendation system improvement for Virtual Academic Advisor system

Community college students come from diverse backgrounds and experience levels. They begin their education pursuing a degree in a major of their choice. Most students aim to transfer to particular universities, an academic path that demands fulfilling specific requirements to become eligible for transfer. Academic advisors at community colleges help students create academic plans, trying their best to incorporate students’ interests, life constraints, and backgrounds. Because this is a heavily manual process that demands experience and familiarity, there is a clear need to automate it. The Virtual Academic Advisor (VAA) system aims to automate academic plan creation for community colleges. The VAA is a research project paired with the development of an interactive software system that supports creating and displaying academic plans based on the needs and preferences of students. Work previously done by various students focused on automated recommendation of core courses for targeted majors. However, no research or development has been done to incorporate the selection of elective courses when generating an academic plan, nor has a clear strategy been outlined for integrating elective recommendation with the VAA system. Incorporating electives opens up a whole new research aspect of automated scheduling. Furthermore, elective-course selection is crucial for scheduling associate degree plans. Associate degrees are offered by community colleges, and students can earn such a degree before or without transferring to a university. In this capstone project, we incorporate the logic and functionality of scheduling elective courses along with the core courses to generate associate degree schedules for the student’s intended major and university. We gather the necessary data for the elective courses and test our scheduler on associate degree schedules.
This project also addresses the research and implementation necessary to generate alternative-schedule recommendations and its integration with the VAA system using APIs. This will assist students in exploring alternate academic paths.

Tuesday, June 1

DANIEL BLASHAW

Chair: Dr.  Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
8:45 A.M.; Online
Project: Interactive Environment to Support Agent-Based Graph Programming in MASS Java

MASS is an Agent-Based Modeling (ABM) library that allows for parallelized simulation over a distributed computing cluster. In these simulations, Place objects act as the environment for Agents to interact with and may be internally organized as multi-dimensional arrays, graphs, or trees. Graph-based simulations are best suited for systems where Places’ relationships to one another may dynamically change or where Places have an indefinite number of neighbors, such as in social networks. However, these graphs are often very complex and make debugging and verification more difficult for the programmer. To address this problem, the goal of this project is to extend the MASS Java library with a development environment that allows the programmer to step through a graph-based ABM and visually inspect associated Places and Agents. To accomplish this, we have incorporated Java’s JShell for line-by-line execution, checkpointing, and rollback of a simulation; expanded MASS-Cytoscape integration with a full control panel, Agent visualizations, and the option to view subgraphs from MASS; and added Agent-tracking functions to the MASS API. These additions result in a development environment that gives programmers the flexibility to rapidly explore and iterate on graph-based ABMs, free to focus on the logic of their simulations rather than the infrastructure needed to validate their output. Further, although the functionality discussed in this project was designed for graph-based ABMs, its implementation benefits many non-graph applications and provides a solid foundation for further expansion of the MASS Java library, such as real-time cluster monitoring and visualization of other simulation data structures.


JEFFY JAHFAR POOZHITHARA

Chair: Dr.  Hazeline Asuncion
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Thesis: Automated Vulnerability Prediction in Software Systems and Lightweight Identification of Design Patterns in Source Code

The heavy investment in cost and time that software development companies put into post-production support for fixing security vulnerabilities in their products demands an automated mechanism to identify these vulnerabilities during and after software development. Such an approach can help reduce these costs by including corresponding solutions, such as security design patterns, when making architectural decisions. This will reduce the system-wide architectural changes required post-development and enable efficient documentation and maintenance of software systems. As part of this research, we created a natural language processing-based model that predicts security vulnerabilities in software systems using keywords and n-grams extracted from software documentation. We analyzed the correlation of certain keywords and n-grams with the occurrence of various security vulnerabilities, as well as the correlations between different vulnerabilities. Additionally, we analyzed the performance of classification algorithms (Logistic Regression, Support Vector Machines, K-Nearest Neighbors, Multi-layer Perceptron, and Random Forest) in the prediction. To enable the analysis, we also created a dataset by mapping over 200,000 vulnerability reports on the CVE website to the technical/functional documentation of 3602 products. The preliminary analysis shows that the performance of the documentation-based predictor is comparable to or better than prediction using source code as well as other static analysis methods. Further, identifying which design patterns already exist in source code can help maintenance engineers determine whether new requirements can be satisfied. Current techniques for finding design patterns in source code exist, but some require manually labeling training datasets or manually specifying rules or queries for each pattern.
To address this challenge, we introduce PatternScout, a technique for automatically generating SPARQL queries by parsing UML diagrams of design patterns, ensuring that pattern characteristics are matched. We discuss the key concepts and the design of PatternScout. Our results indicate that PatternScout can automatically generate queries for the three types of design patterns (i.e., creational, behavioral, structural), with accuracy that is comparable to, or better than, existing techniques.

Wednesday, June 2

JIASHUN GOU

Chair: Dr.  Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: Containerization Support for Multi-Agent Spatial Simulation

The Multi-Agent Spatial Simulation (MASS) library is a parallel-computing library designed to execute applications over a cluster of computing nodes. In this capstone project, we aim to improve MASS developers’ experience by adding containerization support to the MASS library and its applications, as well as adding a Continuous Integration/Continuous Delivery (CI/CD) pipeline to all related code repositories. First, we designed and implemented containerization support for two versions of the MASS library and three sample containerized MASS applications. Second, we added a CI/CD pipeline to each code repository of the containerized MASS library and MASS applications. Third, we evaluated the implementation of the containerized MASS library and applications on five aspects: reliability, usability, efficiency, maintainability, and portability. In comparison to the original MASS, the containerized MASS library and its applications demonstrate a noticeable increase in usability and maintainability. The project delivers two achievements: (1) the containerized MASS library and applications provide MASS developers a consistent development environment, and (2) the CI/CD pipeline reduces MASS developers’ workload, especially in testing and release procedures.


MANJUSHA KALIDINDI

Chair: Dr. William Erdly
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Virtual reality-based tools for vision therapy (VT) for near-vision disorders

Strabismus (crossed eyes) is one of the most common eye conditions in children. If left untreated, it can lead to amblyopia, commonly known as lazy eye. To regain binocular vision, a person with strabismus requires training in five levels of fusion skills, each level indicating progression in ability and vision complexity. The project aims to use virtual reality (VR) to provide an environment for individualized, supervised therapy for children and adults with strabismus to regain binocular vision. The scope of the project is to achieve the first two fusion skills: “Luster” and “Simultaneous Perception.” This pilot project is the first of a new toolkit of VR therapy activities built at the UW Bothell EYE Center for Children’s Learning, Vision, and Technologies (the “EYE Center”). The use of a model-view-controller (MVC) architectural design with an object-oriented architectural style helped the project achieve simplicity, portability, readability, and expandability. In addition, the project adopted the Agile software development methodology with Scrum and Kanban frameworks, allowing for engaged and accelerated development. Clinician and patient surveys were conducted to evaluate the success of the project. Future steps include designing tools for the remaining three fusion skills as well as completing additional usability and design studies.

Keywords: Virtual Reality, Vision Therapy, Strabismus, Serious Games

Thursday, June 3

NATHAN RANNO

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Enabling Deep Geometric Learning on Cryo-EM Maps Using Neural Representation

Advances in imagery at atomic and near-atomic resolution, such as cryogenic electron microscopy (cryo-EM), have led to an influx of high resolution images of proteins and other macromolecular structures to data banks worldwide. Producing a protein structure from the discrete voxel grid data of cryo-EM maps involves interpolation into the continuous spatial domain. We present a novel data format called the neural cryo-EM map, which is formed from a set of neural networks that accurately parameterize cryo-EM maps and provide native, spatially continuous data for density and gradient. As a case study of this data format, we create graph-based interpretations of high resolution experimental cryo-EM maps. Normalized cryo-EM map values interpolated using the non-linear neural cryo-EM format are more accurate than conventional tri-linear interpolation, consistently scoring less than 0.01 mean absolute error versus up to 0.12 for the tri-linear approach. Our graph-based interpretations of 115 experimental cryo-EM maps from 1.15 to 4.0 Angstrom resolution provide high coverage of the underlying amino acid residue locations, while the accuracy of nodes is correlated with resolution. The nodes of graphs created from atomic resolution maps (higher than 1.6 Angstrom) provide greater than 99% residue coverage as well as 85% full atomic coverage, with a mean root mean squared deviation (RMSD) of 0.19 Angstrom. Other graphs have a mean 84% residue coverage with less specificity of the nodes due to experimental noise and differences of density context at lower resolutions. The fully continuous and differentiable nature of the neural cryo-EM map enables the adaptation of the voxel data to alternative data formats, such as a graph that characterizes the atomic locations of the underlying protein or macromolecular structure.
Graphs created from atomic resolution maps are superior in finding atom locations and may serve as input to predictive residue classification and structure segmentation methods. This work may be generalized to transform any 3D grid-based data format into a non-linear, continuous, and differentiable format for downstream geometric deep learning applications.
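The baseline that the neural format is measured against is conventional tri-linear interpolation over the voxel grid, which can be sketched in a few lines (pure-Python illustration over a nested-list grid; not the project's code):

```python
def trilinear(grid, x, y, z):
    """Conventional tri-linear interpolation on a 3D voxel grid
    (nested lists indexed grid[i][j][k]): blend the 8 surrounding
    voxel values by their fractional distances to the query point."""
    i, j, k = int(x), int(y), int(z)       # lower-corner voxel indices
    dx, dy, dz = x - i, y - j, z - k       # fractional offsets in [0, 1)
    val = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((dx if di else 1 - dx) *
                     (dy if dj else 1 - dy) *
                     (dz if dk else 1 - dz))
                val += w * grid[i + di][j + dj][k + dk]
    return val

# 2x2x2 toy grid: density rises linearly along z
grid = [[[0.0, 1.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
print(trilinear(grid, 0.5, 0.5, 0.25))  # 0.25
```

The neural cryo-EM map replaces this piecewise-linear blend with a learned, fully differentiable function of the coordinates.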

Friday, June 4

SHAWN QUINN

Chair: Dr. Clark Olson
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: An Autonomous Mobile Robot Software System: Embedded Real-time Perception and Control, Motion Tracking, Optical Flow, Visual Odometry, and Localization

Advanced robotics development requires software at multiple levels of sophistication and complexity, from low-level embedded code up to high-level algorithm implementations. Computer vision and image processing play an important role in advanced robotics applications. Object tracking, visual navigation, and simultaneous localization and mapping (SLAM) allow a robot to perceive and move through physical space without prior knowledge of its environment and surroundings. Recent advances in microcontrollers and edge computing hardware enable execution of highly complex algorithms in embedded systems using a multiple controller and processor architecture. Small unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs), or drones, benefit from these computing advances by executing sophisticated sensing and control code without dependence on offline/cloud-based computing resources. This project focuses on several key areas of the robotics development stack, specifically the Robot Operating System (ROS), real-time operating systems (RTOS), locomotion, cameras, motion tracking via optical flow, and visual odometry and localization. Design and architectural trade-offs are discussed, implementations are presented, and inter-relationships between the various elements are examined.


SOHELI SULTANA

Chair: Dr. Geethapriya Thamilarasu 
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Security and Privacy of Smart Home Using Software Firewall at Home Gateway

The rapid development of Internet of Things (IoT) technology has led to the growing popularity of smart home applications. Smart homes promise to significantly enhance domestic comfort by minimizing the user intervention needed to monitor and control home appliances. However, the use of heterogeneous IoT devices and multiple users in smart home environments gives rise to unique security, privacy, and usability challenges. In addition, cloud-based IoT architectures also contribute to security and privacy concerns, as cloud services do not address attacks at the home gateway. Existing solutions for smart home security often rely on network-layer firewalls. However, network-layer firewalls do not provide adequate security because they do not inspect payload content, potentially resulting in smart home devices being compromised. In this project, we design and develop an application-layer software firewall on top of the network-layer firewall to enhance the security of smart home networks. The proposed application-layer software firewall runs on the smart home device gateway as an embedded server, monitors all network traffic, and works as a proxy to control access to smart home device networks. Our software firewall solution enables users to define firewall rules, providing a more sophisticated way to control and configure smart home devices and protect them from external attacks. In addition, the software firewall rules are designed for interactions between, and use by, multiple people. We simulate various attacks to evaluate the performance of the software firewall rules. Experimental results show that application-layer firewalls are able to successfully mitigate the attacks and help provide an enhanced layer of security in smart home environments.
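The payload-inspection idea that distinguishes an application-layer firewall from a network-layer one can be sketched as a rule check of the kind such a proxy might apply. The device IDs, operations, and deny patterns below are invented for illustration; the project's actual rule language is not shown.

```python
# Hypothetical global payload-content signatures the gateway proxy rejects
DENY_PATTERNS = [b"<script>", b"'; DROP TABLE", b"/etc/passwd"]

def allow_request(device_id, payload, user_rules):
    """Return True only if the payload passes both the user-defined
    per-device operation rules and the payload-content checks --
    the inspection a network-layer (port/address) firewall cannot do."""
    allowed_ops = user_rules.get(device_id, set())
    op = payload.split(b" ", 1)[0]            # e.g. b"SET" or b"GET"
    if op not in allowed_ops:
        return False                          # user rule: operation denied
    return not any(p in payload for p in DENY_PATTERNS)

rules = {"thermostat-1": {b"GET", b"SET"}}
print(allow_request("thermostat-1", b"SET temp=21", rules))  # True
print(allow_request("thermostat-1", b"REBOOT now", rules))   # False
```

In the project, checks of this kind run on the gateway itself, so malicious content is dropped before it reaches a device.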


ANISH PRASAD

Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
5:45 P.M.; Online
Thesis: A Latency-Aware Provisioning Solution for Mobile Inference Serving at the Edge

With the advancement of machine learning (ML), a growing number of mobile clients rely on ML inference for making time-sensitive and safety-critical decisions. Therefore, the demand for high-quality and low-latency inference services at the network edge has become the key to the modern intelligent society. This paper proposes a novel provisioning solution for reducing the overall latency of mobile inference on edge servers. Unlike existing solutions that either direct inference requests to the nearest edge server or balance the workload between edge servers, the solution we propose provisions each edge server with the optimal type and number of inference serving instances under a holistic consideration of networking, computing, and memory resources. Mobile clients can thus utilize ML inference services on edge servers that offer minimal inference serving latency. The proposed solution has been implemented using TensorFlow Serving and Kubernetes on a cluster of edge servers, including Nvidia Jetson Nano and Jetson Xavier. We demonstrate the proposed solution's effectiveness in reducing the overall inference latency under various system parameters and practical system settings through simulation and testbed experiments, respectively.
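The selection principle described above can be illustrated with a toy model: route each client to the edge server whose modeled total serving latency (network + queuing + compute) is lowest. The linear latency model and the numbers are illustrative assumptions, not the thesis's provisioning algorithm.

```python
def total_latency(server, request_rate):
    """Toy latency model: round-trip network delay, plus a queuing
    delay that grows with the request rate, plus one inference."""
    queue_delay = request_rate * server["per_req_compute_ms"]
    return server["net_rtt_ms"] + queue_delay + server["per_req_compute_ms"]

def pick_server(servers, request_rate):
    """Choose the edge server offering minimal modeled serving latency."""
    return min(servers, key=lambda s: total_latency(s, request_rate))

# Hypothetical cluster resembling the paper's testbed hardware
servers = [
    {"name": "jetson-nano",   "net_rtt_ms": 5,  "per_req_compute_ms": 40},
    {"name": "jetson-xavier", "net_rtt_ms": 12, "per_req_compute_ms": 15},
]
print(pick_server(servers, request_rate=2)["name"])  # jetson-xavier
```

The actual solution additionally decides the type and number of serving instances per server under memory and compute constraints, which this sketch omits.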

WINTER 2021

Friday, March 5

NAVJODH DHILLON

Chair: Dr. Dong Si
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Project: A Machine Learning-based Portable Traumatic Brain Injury Detection System using Raspberry Pi

Traumatic Brain Injury (TBI) is a common cause of death and disability in the United States (U.S.) and the world. However, existing tools for TBI diagnosis are either subjective or require extensive clinical setup and expertise. The increasing affordability and reduction in the size of relatively high-performance computing systems, combined with promising results from TBI-related machine learning research, make it possible to create compact and portable systems for early detection of TBI. This project focuses on using machine learning techniques to automatically classify electroencephalography (EEG) signals to identify whether a signal is captured from a TBI-afflicted subject or a healthy (control) subject. The second area of focus is demonstrating that hardware deployment of a pre-trained model to an embedded system (Raspberry Pi) is feasible and comparing deployment configurations. We discuss the design, implementation, and verification of the deployment system, which can digitize the EEG signal using an Analog to Digital Converter (ADC) and perform real-time signal classification to detect the presence of TBI. Machine learning techniques used for training classification models and performing classification include Logistic Regression, Random Forest, k-Nearest Neighbors (k-NN), and Decision Tree. A peak classification accuracy of 94.5% at a 60 s epoch size was achieved using k-NN. The accuracy of the Raspberry Pi-based deployment system was comparable to that of a high-performance desktop computer, to within 0.7 percentage points. The deployment system developed in this work can potentially be used for other types of EEG classification use cases. Such applications can include multi-class classification, such as identification of a subject's intended limb movements, and monitoring a subject's health by analyzing sleep patterns for post-TBI rehabilitation.
This work can enable the development of systems suitable for field use, without requiring specialized medical equipment, for early TBI detection applications and TBI research. Further, this work opens avenues to implement connected, real-time TBI-related health and wellness monitoring systems.
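The k-NN classification step that achieved the peak accuracy above can be illustrated with a minimal distance-and-vote sketch. The two-dimensional features and labels are toy values; the project's actual EEG feature extraction and epoching are not shown.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs. Return the majority
    label among the k training samples nearest to the query by
    Euclidean distance."""
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy feature space standing in for features derived from EEG epochs
train = [([0.1, 0.2], "control"), ([0.2, 0.1], "control"),
         ([0.9, 0.8], "tbi"),     ([0.8, 0.9], "tbi")]
print(knn_predict(train, [0.85, 0.85]))  # tbi
```

The same predict routine runs identically on a desktop or a Raspberry Pi, which is what makes the embedded-deployment comparison in the project straightforward.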


YUNA GUO

Chair: Dr. Munehiro Fukuda
Candidate: Master of Science in Computer Science & Software Engineering
1:15 P.M.; Online
Project: Construction of Agent-navigable Data Structure from Input File

The multi-agent spatial simulation (MASS) library is an agent-based parallelizing library for analyzing structured datasets over a cluster of computing nodes. The current version of the MASS library supports distributed multi-dimensional arrays and graphs. In this capstone project, we aim to develop three distributed data structures in MASS: Continuous Space, Quad Tree, and Binary Tree. First, we designed and implemented these three data structures. Second, we used two geometric applications, Closest Pair of Points and Voronoi Diagram, to evaluate the programmability and execution performance of Continuous Space and Quad Tree. Third, we implemented a searching application, Range Search, with Binary Tree. Programmability, execution time, and memory consumption were measured for performance evaluation. In comparison to the original MASS or MASS Graph, the programmability results show that all three implementations reduced the lines of code (LOC) and the number of classes. The performance evaluation shows that all three implementations reduce execution time and memory consumption for the applications. The project successfully carried out two achievements: (1) the Continuous Space and Quad Tree facilitate users applying MASS to geometric problems, and (2) the Binary Tree allows users to apply O(log N) search and divide-and-conquer algorithms in their applications.
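The Range Search application's use of the Binary Tree can be illustrated with a plain binary-search-tree range query, which prunes subtrees that cannot contain matches. This is a standalone sketch, not the MASS library API.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Standard unbalanced BST insert."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def range_search(root, lo, hi, out):
    """Collect keys in [lo, hi] in sorted order, skipping any subtree
    whose keys provably fall outside the range."""
    if root is None:
        return out
    if lo < root.key:                 # left subtree may hold keys >= lo
        range_search(root.left, lo, hi, out)
    if lo <= root.key <= hi:
        out.append(root.key)
    if root.key < hi:                 # right subtree may hold keys <= hi
        range_search(root.right, lo, hi, out)
    return out

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
print(range_search(root, 35, 65, []))  # [40, 50, 60]
```

In MASS, the same traversal is distributed, with agents navigating tree places across cluster nodes rather than following in-memory pointers.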


ANDREW HUNZIKER

Chair: Dr. Yang Peng
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: MLib: A Web Service for Cataloging Machine Learning Inference Results

Machine learning inference results are often buried deep in the results sections of academic papers or presented in industry whitepapers. The siloed nature of these results makes broad comparisons between machine learning frameworks and models difficult. Several sites exist to catalog machine learning inference results, but no existing site provides a one-stop shop for comparing the accuracy, throughput, and memory footprint of models across multiple frameworks and devices. Additionally, existing machine learning benchmarking sites only support uploading results but not running models in situ. Machine Learning Inference Benchmark (MLib) was conceived to address these shortcomings. MLib is a web service that runs user-submitted models in situ against a common benchmark and builds a database of the results. The results database can be queried by users and sorted by accuracy, throughput, memory footprint, framework, and device. Running models in situ ensures that results are as comparable as possible since each inference occurs in the same execution environment and against the same benchmark. Machine learning practitioners embarking on a new project can use the results on MLib to guide their initial framework and model selection.

Thursday, March 11

SNIGDHA SINGH

Chair: Dr. Michael Stiber
Candidate: Master of Science in Computer Science & Software Engineering
11:00 A.M.; Online
Thesis: Brain Graph Analysis For Integrate-and-Fire Neurons With STDP

Many real-world systems can be represented as networks and studied using graph theory. Brain graphs are widely used to analyze brain connectomes using graph theory. Electrophysiological data, tract-tracing, and MRI data have been used to extract functional brain graphs. The goal of my thesis is to study a spatiotemporal neural dataset as a brain graph, compare the properties of the connectome with random graph models of similar size, and analyze the effect of connections in the graph. Using a simulator solves the problems related to pre-processing, data acquisition, and length of time series that exist when extracting brain graphs using other data collection methods. Synaptic plasticity is an important part of the functioning and growth of a neural network. Spike-timing-dependent plasticity (STDP) has emerged as one of the most widely used plasticity mechanisms due to its physiologically realistic induction and evidence of its presence in vivo. We have analyzed the effect of different STDP algorithms and parameters on the connections in the graph. In this thesis, the brain graph generated using the leaky integrate-and-fire model and STDP is studied and compared.
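The pair-based STDP rule commonly used in such studies can be written in a few lines: a synapse is potentiated when the presynaptic spike precedes the postsynaptic spike and depressed otherwise, with exponentially decaying magnitude. The amplitudes and time constants below are illustrative; the thesis compares several STDP variants.

```python
import math

A_PLUS, A_MINUS = 0.01, 0.012     # learning-rate amplitudes (assumed values)
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # decay time constants in ms (assumed values)

def stdp_dw(dt):
    """Weight change for a spike pair, dt = t_post - t_pre in ms.
    Pre-before-post (dt > 0) potentiates; post-before-pre (dt < 0)
    depresses, each decaying exponentially with |dt|."""
    if dt > 0:
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    return -A_MINUS * math.exp(dt / TAU_MINUS)

print(round(stdp_dw(10.0), 5))   # 0.00607  (potentiation)
print(round(stdp_dw(-10.0), 5))  # -0.00728 (depression)
```

Accumulating these updates over simulated spike trains is what reshapes the connection weights, and hence the extracted brain graph, over the course of a run.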

Friday, March 12

ARDALAN AHANCHI

Chair: Dr. Erika Parsons
Candidate: Master of Science in Computer Science & Software Engineering
3:30 P.M.; Online
Project: Oxid OS: A modern educational Kernel in Rust

The kernel is the underlying code that every user program relies on. In modern systems, all hardware management code is abstracted through the kernel (and the corresponding drivers). It is critical for systems-level software engineers to understand the purpose of kernel components, their functionality, and implementation fundamentals. The most common open-source kernels (e.g., Linux, BSD, Redox, Haiku) have large and complex code bases. Thus, they are not the most suitable tools for teaching operating system fundamentals. Most educational open-source kernels (e.g., Minix, MentOS) target older hardware (68K, x86) or use languages that lack modern features (C). This project presents a modern, well-documented educational kernel written in Rust for the x86_64 platform. Oxid OS provides an educational playground for future kernel developers to experiment with various aspects of an operating system, including bootstrapping, memory management (paging, allocators), interrupt handling, input/output, and multi-processing. Additionally, it presents a productive kernel development environment for students, with comprehensive automated build scripts and debugging tools. Compared to other educational kernels, Oxid OS provides a modern, yet simple, architecture that can be modified or extended by future kernel developers.

Back to top

Master of Science in Cybersecurity Engineering

AUTUMN 2022

Wednesday, November 30

PETER VAN EENOO

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: Securing WireGuard with an HSM

WireGuard is a popular, secure, and relatively new VPN implementation that has seen widespread adoption. WireGuard’s basic key management in the reference implementation leaves some weaknesses that could be exploited by threat actors to steal keys, compromising a user’s identity, or to exploit their privileged access. In my project, I combined the industry-standard practice of isolating sensitive data with cutting-edge support for Curve25519 keys on a hardware security module (HSM). I created a WireGuard-compatible fork called WireGuard-HSM, which uses the PKCS#11 interface to securely access a user’s private key and perform privileged operations on a USB security key. After performing two threat model analyses and comparing the results, I show how my modifications improve the security of the WireGuard system by decreasing the attack surface and mitigating two vulnerabilities if the host computer is compromised. WireGuard-HSM’s security improvements come without a noticeable performance penalty.

SUMMER 2022

Thursday, July 14

CHRISTIAN DUNHAM

Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Thesis: Adversarial Trained Deep Learning Poisoning Defense: SpaceTime

Smart homes, hospitals, and industrial complexes are increasingly reliant on Internet of Things (IoT) technology to unlock doors, regulate insulin pumps, or operate critical national infrastructure. While these technologies have enabled capabilities that were not achievable before IoT, the increased adoption of IoT has also expanded the attack surface and increased the security risks in these landscapes. Diverse IoT protocols and networks have proliferated, allowing these tiny sensors with limited resources both to create new edge networks and to deploy at depth into conventional internet stacks. The diverse nature of IoT devices and their networks has disrupted traditional security solutions.

Intrusion Detection Systems (IDS) are one security mechanism that must adopt a new paradigm to provide measurable security in this technological evolution. The diverse resource limitations of IoT devices and their enhanced need for data privacy complicate the centralized machine learning models used by modern IDS for IoT environments. Federated Learning (FL) has drawn recent interest for adapting solutions to meet the requirements of the unevenly distributed nodes in IoT environments. A federated anomaly-based IDS for IoT adapts to the computational restraints, privacy needs, and heterogeneous nature of IoT networks.

However, many recent studies have demonstrated that federated models are vulnerable to poisoning attacks. The goal of this research is to harden the security of federated learning models in IoT environments against poisoning attacks. To the best of our knowledge, poisoning defenses do not exist for IoT. Existing solutions to defend against poisoning attacks in other domains commonly utilize spatial similarity measurements such as Euclidean distance (ED), cosine similarity (CS), and other pairwise measurements to identify poisoning attacks.

Poisoning attack methodologies have also adapted to IoT, evolving to defeat these existing defensive solutions. This evolution creates a need to develop new defensive methodologies. In this thesis, we develop SpaceTime, a deep-learning recurrent neural network that uses a four-dimensional spacetime manifold to distinguish federated participants. SpaceTime is built upon a many-to-one time-series regression architecture to provide an adversarially trained defense for federated learning models. Simulation results show that SpaceTime exceeds previous solutions against Byzantine and Sybil label-flipping, backdoor, and distributed backdoor attacks in an IoT environment.
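The pairwise-similarity defenses that SpaceTime is measured against can be sketched as a cosine-similarity check that flags suspiciously aligned client updates, a hallmark of coordinated Sybil label-flipping. This is a generic illustration of such schemes, not SpaceTime itself; the vectors and threshold are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two update vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def flag_sybils(updates, threshold=0.99):
    """Flag clients whose submitted model updates point in nearly the
    same direction as another client's -- coordinated poisoners tend
    to produce near-duplicate updates, honest clients do not."""
    flagged = set()
    names = list(updates)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(updates[a], updates[b]) > threshold:
                flagged.update({a, b})
    return flagged

updates = {
    "client1": [0.2, -0.1, 0.4],
    "client2": [0.21, -0.1, 0.41],   # near-duplicate of client1
    "client3": [-0.3, 0.5, 0.1],
}
print(sorted(flag_sybils(updates)))  # ['client1', 'client2']
```

The thesis argues that evolved poisoning strategies evade exactly this kind of static pairwise check, motivating the learned spacetime representation.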

SPRING 2022

Monday, May 16

MATTHEW SELL

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: Designing an Industrial Cybersecurity Program for an Operational Technology Group

The design of a cybersecurity program for an Information Technology (“IT”) group is well documented by a variety of international standards, such as those provided by the U.S. National Institute of Standards and Technology (“NIST”) 800-series Special Publications. However, for those wishing to apply standard information security practices in an Operational Technology (“OT”) environment that supports industrial control and support systems, guidance is seemingly sparse.

For example, a search of a popular online retailer for textbooks on the implementation of an industrial cybersecurity program revealed only seven books dedicated to the subject, with another two acting as “how-to” guides for exploiting vulnerabilities in industrial control systems. Some textbooks cover the high-level topics of developing such a program, but only describe the applicable standards, policies, and tools in abstract terms. It is left as an exercise to the reader to explore these concepts further when developing their own industrial cybersecurity program.

This project expands on the abstract concepts described in textbooks like those previously mentioned by documenting the implementation of an industrial cybersecurity program for a local manufacturing firm. The project started with hardware and software asset inventories, followed by a risk assessment and gap analysis, and then implemented mitigating controls using a combination of manual and automated procedures. The security posture of the OT group was continually evaluated against corporate security goals, the project-generated risk assessment, and NIST SP 800-171 requirements. Improvements in security posture and compliance with corporate requirements were achieved in part through alignment with existing policies and procedures developed by the organization’s IT group, with the balance implemented and documented by the author of this project. The materials generated by this project may be used to assist other organizations starting their journey toward securing their industrial control assets.

Friday, May 20

JAYNIE A. SHORB

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Hybrid (DISC 464 & Online)
Project: Malicious USB Cable Exposer

Universal Serial Bus (USB) cables are ubiquitous, connecting a wide variety of devices such as audio, visual, and data entry systems and charging batteries. Electronic devices have decreased in size over time and are now small enough to fit within the housing of a USB connector. There are harmless 100W USB cables with embedded E-marker chips that communicate power delivery capabilities for sourcing and sinking current to charge mobile devices quickly. However, some companies have designed malicious hardware implants containing keyloggers and other nefarious programs in an effort to extract data from victims. Any system compromise that can be implemented with a keyboard is possible with such implants. This project designs a malicious hardware implant detector that senses the current draw of a USB cable, exposing these insidious designs. The Malicious USB Exposer is a hardware circuit implementation with common USB connectors to plug in the device under test (DUT). It provides power to the DUT and uses a current sensor to determine the current draw of the cable. The output is a red LED bargraph that shows whether the DUT is compromised. Unless the DUT contains LEDs internally, any red LED output indicates compromise. Active long USB cables intended to drive long distances produce a false positive and are not supported. The minimum current sensed is 10mA, which is outside the range of normal USB cables with LEDs (4-6mA) and E-marker chips (1mA). Though there is another malicious USB detector on the market, it was created by a malicious USB cable supplier and designed to detect their own cable. This project provides an open-source solution for distinguishing USB cables to uncover a range of compromised cables from different vendors.

Wednesday, May 25

ANKITA CHAKRABORTY

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
3:30 P.M.; Online
Project: Exploring Adversarial Robustness Using TextAttack

Deep neural networks (DNNs) are subject to adversarial examples that force deep learning classifiers to make incorrect predictions on input samples. In the visual domain, these perturbations are typically indistinguishable to human perception, resulting in disagreement between classifications made by people and by state-of-the-art models. In the natural language domain, on the other hand, small perturbations are readily perceptible, and the change of a single word might substantially alter a document's semantics. In our approach, we perform ablation studies to analyze the robustness of various attacks in the NLP domain and formulate ways to alter the factor of “robustness,” leading to more diverse adversarial text attacks. This work relies heavily on TextAttack (a Python framework for adversarial attacks, data augmentation, and adversarial training in NLP) to deduce the robustness of various models under attack from pre-existing or fabricated attacks. We offer various strategies to generate adversarial examples on text classification models that avoid the out-of-context and unnaturally complex token replacements easily identifiable by humans. We compare the results of our project with two baselines: random and pre-existing recipes. Finally, we conduct human evaluations with thirty-two volunteers from diverse backgrounds to guarantee semantic and grammatical coherence. Our research project proposes three novel attack recipes: USEHomoglyphSwap, InputReductionLeven, and CompositeWordSwaps. Not only do these attacks reduce the prediction accuracy of current state-of-the-art deep-learning models to 0% with the fewest queries, but they also create crafted text that is, to a great extent, visually imperceptible to human annotators.
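The homoglyph-swap idea behind one of the proposed recipes can be illustrated in a few lines: visually identical Unicode look-alikes replace selected characters, perturbing the token stream a model sees while remaining imperceptible to readers. The mapping and positions below are chosen for this sketch; the actual recipe is built on TextAttack's transformation framework.

```python
# A few Latin -> Cyrillic look-alike substitutions (illustrative subset)
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}

def homoglyph_swap(text, positions):
    """Swap the characters at the given positions for homoglyphs.
    The result renders the same to a human but is a different string
    (and a different token sequence) to a text classifier."""
    chars = list(text)
    for i in positions:
        chars[i] = HOMOGLYPHS.get(chars[i], chars[i])
    return "".join(chars)

adv = homoglyph_swap("great movie", [2, 7])  # swap the 'e' and the 'o'
print(adv == "great movie")  # False: the strings differ underneath
print(adv)                   # ...yet render identically on screen
```

An attack recipe would search over which positions to swap, querying the victim model until its prediction flips.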

Wednesday, June 1

ROCHELLE PALTING

Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.; Online
Project: A Methodology for Testing Intrusion Detection Systems for Advanced Persistent Threat Attacks

Advanced Persistent Threats (APTs) are well-resourced, highly skilled, adaptive, malicious actors who pose a major threat to the security of an organization's critical infrastructure and sensitive data. An Intrusion Detection System (IDS) is one type of mechanism used to detect attacks. Testing with a current and realistic intrusion dataset, promptly detecting and correlating malicious behavior at various attack stages, and utilizing relevant metrics are critical to effectively testing an IDS for APT attack detection. Testing with outdated and unrealistic data would yield results unrepresentative of the IDS's ability to detect real-world APT attacks. In this project, we present a testing methodology utilizing our recommended procedure for preparing the intrusion dataset, along with recommended evaluation metrics. Our proposed testing methodology incorporates a software program we developed that dynamically retrieves real-world intrusion examples compiled in the MITRE ATT&CK knowledge base, presents the list of known APT tactics and techniques for user selection into a scenario, and exports the attack scenario to an output file consisting of the selected APT tactics and techniques. Our testing methodology, along with the attack scenario generator, provides IDS testers with guidance for testing with a current and realistic dataset and with additional evaluation data points to improve the IDS under test. The benefits afforded to IDS testers include time saved in dataset preparation and improved reliability in their evaluation of IDS APT detection.
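The generator's flow, selecting tactics and techniques and then exporting the scenario to a file, can be sketched as follows. The catalog entries and JSON layout here are illustrative; the real tool retrieves its entries dynamically from the MITRE ATT&CK knowledge base.

```python
import json

# Hypothetical slice of an ATT&CK-style catalog (tactic -> techniques)
CATALOG = {
    "TA0001 Initial Access": ["T1566 Phishing",
                              "T1190 Exploit Public-Facing Application"],
    "TA0003 Persistence": ["T1053 Scheduled Task/Job"],
}

def build_scenario(selection):
    """selection: {tactic: [technique, ...]} as chosen by the tester."""
    return {"scenario": [{"tactic": tactic, "techniques": techniques}
                         for tactic, techniques in selection.items()]}

def export_scenario(selection, path):
    """Write the selected tactics/techniques to the output file."""
    with open(path, "w") as f:
        json.dump(build_scenario(selection), f, indent=2)

chosen = {"TA0001 Initial Access": ["T1566 Phishing"]}
print(build_scenario(chosen))
# {'scenario': [{'tactic': 'TA0001 Initial Access', 'techniques': ['T1566 Phishing']}]}
export_scenario(chosen, "scenario.json")
```

A tester would feed the exported file into their dataset-preparation step so the IDS is exercised against the chosen APT behaviors.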

Thursday, June 2

CHRISTOPHER COY

Chair: Dr. Geethapriya Thamilarasu
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.; Online
Project: Multi-platform User Activity Digital Forensics Intelligence Collection

In today’s interconnected world, computing devices are employed for all manner of professional and personal activity, from implementing business processes and email communications to online shopping and web browsing.  While most of this activity is legitimate, there are user actions that violate corporate policy or constitute criminal activity, such as clicking a link in a phishing email or downloading child sexual abuse material.

When a user is suspected of violating policies or law, a digital forensic analyst is typically brought in to investigate the traces of user activity on a system in an effort to confirm or refute the suspected activity.

Digital forensics analysts need the capability to quickly and easily collect and process key user activity artifacts that enable rapid analysis and swift decision making. The FORINT project was developed to provide digital forensics analysts with this very capability across multiple operating systems.


SARIKA RAMESH BHARAMBE

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: New Approach towards Self-destruction of Data in Cloud

One of the most pressing issues faced by the cloud service industry is ensuring data privacy and security. Dealing with data in a cloud environment that leverages shared resources, while offering reliable and secure cloud services, necessitates a strong encryption solution with no or minimal performance impact. One approach to this issue is to introduce self-destruction of data, which mainly aims at protecting shared data. Encrypting files is a simple way to protect personal or commercial data. Using a hybrid RSA-AES algorithm, we propose a time-based self-destruction method to address the above difficulties and improve file encryption performance and security using file-split functionality. Each data owner must set an expiration limit on content shared for collaboration, which takes effect after the file is uploaded to the cloud. Once the user-specified expiration period has passed, the sensitive information is securely self-destructed.

In this approach we introduce the use of cloud channels, which helps increase data security: we split the bits of each word and upload them in encrypted form. For this purpose we use ThingSpeak, a cloud platform for visualizing, analyzing, and sharing data through public and private channels. We experimentally measure the performance overhead of our approach on ThingSpeak and use realistic tests to demonstrate the viability of our solution for enhancing the security of cloud-based data storage. For encryption and decryption we use a hybrid RSA-AES algorithm. The results of our experiments indicate that this algorithm offers higher efficiency, increased accuracy, better performance, and security benefits.
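The hybrid pattern described above — a fresh AES key per file, wrapped with RSA, plus an owner-set expiration time — can be sketched as follows. This is an illustrative sketch using the Python `cryptography` package, not the authors' implementation; the ThingSpeak channel-splitting step is omitted, and expiry is enforced only at decryption time here.

```python
import os
import time

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

def hybrid_encrypt(data: bytes, rsa_public_key, ttl_seconds: float) -> dict:
    """Encrypt data with a fresh AES-256-GCM key; wrap that key with RSA-OAEP."""
    aes_key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # standard GCM nonce size
    return {
        "ciphertext": AESGCM(aes_key).encrypt(nonce, data, None),
        "nonce": nonce,
        "wrapped_key": rsa_public_key.encrypt(aes_key, OAEP),
        "expires_at": time.time() + ttl_seconds,  # owner-set expiration limit
    }

def hybrid_decrypt(blob: dict, rsa_private_key) -> bytes:
    """Refuse to unwrap the AES key once the owner-set expiry has passed."""
    if time.time() > blob["expires_at"]:
        raise PermissionError("content expired: key will not be unwrapped")
    aes_key = rsa_private_key.decrypt(blob["wrapped_key"], OAEP)
    return AESGCM(aes_key).decrypt(blob["nonce"], blob["ciphertext"], None)
```

The design choice this illustrates is why hybrid schemes are used at all: RSA alone is too slow for bulk file data, so the fast symmetric cipher does the heavy lifting while RSA protects only the short key.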

WINTER 2022

Friday, March 11

KRATICA RASTOGI

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: Fully Homomorphic Encryption for Privacy-Preserving Fingerprint Screening

In many applications, fingerprint-based authentication has recently gained traction as a viable alternative to traditional password- or token-based authentication, owing to user convenience and the uniqueness of fingerprint features. Biometric template data, on the other hand, is considered sensitive information because it uniquely links to a user's identity. As a result, it must be secured to avoid data leakage. In this research, a fingerprint authentication system for access control based on fully homomorphic encryption is proposed that protects private fingerprint template data. Fingerprint data can be matched in the encrypted domain using fully homomorphic encryption, making it more difficult for attackers to retrieve the original biometric template without knowing the private key. The authors propose a proof-of-concept implementation based on the SEAL library, which includes the fundamental operations required to perform fingerprint matching. For a 10-bit integer, the suggested system can achieve fingerprint matching in 0.074 seconds.
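The key idea — that a fingerprint match score can be expressed using only the additions and multiplications an FHE scheme supports — can be illustrated with a stand-in ciphertext class. This toy performs no real encryption and is not the SEAL-based implementation; it only shows that a squared-distance score composes entirely from HE-friendly operations, so a server can compute it without ever seeing the template.

```python
class Ct:
    """Stand-in for an HE ciphertext: supports only the operations an FHE
    scheme provides (add, subtract, multiply). No real encryption happens."""
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return Ct(self.value + other.value)
    def __sub__(self, other):
        return Ct(self.value - other.value)
    def __mul__(self, other):
        return Ct(self.value * other.value)

def encrypt(x):  # placeholder for the client-side encryption step
    return Ct(x)

def decrypt(ct):  # placeholder for the key holder's decryption step
    return ct.value

def encrypted_sq_distance(enc_probe, enc_template):
    """Squared Euclidean distance between two encrypted feature vectors,
    composed only of subtractions, multiplications, and additions."""
    total = Ct(0)
    for a, b in zip(enc_probe, enc_template):
        d = a - b
        total = total + d * d
    return total
```

In a real deployment the server would return the encrypted score, and only the private-key holder could decrypt it and compare against a match threshold.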

AUTUMN 2021

Thursday, December 9

NHU LY

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: The impact of cultural factors, trait affect, and risk perception on the spread of COVID-19 misinformation on social media

In the information age, social media has created a favorable environment for spreading misinformation. This has become a serious concern for public health, especially during the unpredictable developments of COVID-19 and its variants. Misinformation related to COVID-19 has spread faster than ever on social media, taking numerous forms, from claims about its origin and treatment methods to claims about vaccines. This leads social media users to make dangerous health decisions and hampers authorities' efforts to control the pandemic. Despite warnings from accredited organizations about the risks of misinformation, a majority of social media users still trust such misinformation and spread it without verifying the reliability of the content. To tackle the spread of COVID-19 misinformation, this paper explores which factors influence users' beliefs in such misinformation and their tendency to share it on social media. We conduct a survey to assess users' reactions and behaviors toward COVID-19 misinformation. Memes and additional measures are included in the survey to evaluate the roles of culture, trait affect, and risk perception in the spread of misinformation.

SUMMER 2021

Thursday, August 12

ERIC HULDERSON

Chair: Dr. Brent Lagesse
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.; Online
Thesis: Adversarial Example Resistant Hyperparameters In Deep Learning Networks

Deep learning has continued to make significant progress across many modalities of machine learning, and it is increasingly being applied to life-critical systems and applications.  Companies from the IT industry such as Amazon and Google, as well as auto manufacturers like Mercedes and Tesla, rely heavily on deep learning to enrich autonomous vehicle capabilities.  Further, deep learning has made its way into the healthcare industry with applications in imaging analytics and diagnostics, electronic health records analysis, and the development of precision medicine.  With deep learning at the center of these life-critical applications, safety and security must be a foundational component of the technology as it moves toward mainstream adoption.  Research has shown that adversarial training can improve the robustness of deep learning models, but at the cost of accuracy.  Moreover, it has been shown that hyperparameter selection can influence the resiliency of neural networks, although data supporting this is limited.  In this work, we expand the research on intelligent hyperparameter selection by incorporating the deep learning architectures ResNet50 and VGG16 while also examining the relationship between robustness and accuracy.
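For context, the adversarial examples that adversarial training defends against are typically generated by perturbing an input along the sign of the loss gradient (the fast gradient sign method, FGSM). The sketch below applies one FGSM step to a hypothetical logistic-regression model — not the ResNet50/VGG16 setup used in this work — to show how a tiny, bounded perturbation can flip a prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM perturbation against logistic regression (labels y in {-1, +1}):
    a step of size eps along the sign of the input gradient of the logistic loss."""
    margin = y * (w @ x + b)
    grad_x = -y * sigmoid(-margin) * w  # d/dx of log(1 + exp(-margin))
    return x + eps * np.sign(grad_x)
```

Adversarial training then augments each batch with such perturbed inputs, which is also where the robustness-versus-accuracy trade-off studied here arises.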

SPRING 2021

Monday, May 17

AVANTIKA AGARWAL

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: Comparison of E2EE group chats provided by various communication platforms and implementing RBAC for E2EE group chats

Currently, there are quite a few communication platforms in worldwide use, such as WhatsApp, Signal, Zoom, and Google Meet, that provide End-to-End Encryption (E2EE) messaging solutions. Given the variety of platforms providing encrypted communication, it is crucial to analyze how E2EE works on each of them, especially for group messaging. The author conducts a literature review and performs practical investigations of the encryption strategies used by these platforms. The practical investigations are carried out through packet dissection and network analysis of encrypted group messaging traffic. This paper also takes a deep dive into one specific aspect of group messaging: group management. Group messaging brings interesting challenges in managing large groups. As the number of people in such groups grows, it may not be easy for a limited set of administrators to carry out all administrative actions. To solve this challenge, this paper proposes performing group management through Role-Based Access Control (RBAC) for E2EE groups. To demonstrate this protocol, the author has implemented it as an Android application built on top of the Signal SDK. This paper also presents a security and network analysis of the proposed solution.
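At its core, an RBAC scheme for group management is a mapping from roles to permitted administrative actions, checked before any action is applied. A minimal sketch with hypothetical role and permission names (the actual protocol must also handle E2EE-specific concerns, such as cryptographically authenticating role assignments):

```python
from enum import Flag, auto

class Perm(Flag):
    ADD_MEMBER = auto()
    REMOVE_MEMBER = auto()
    EDIT_GROUP_INFO = auto()
    ASSIGN_ROLE = auto()

# Hypothetical role set; a real deployment defines its own roles.
ROLE_PERMS = {
    "admin": Perm.ADD_MEMBER | Perm.REMOVE_MEMBER | Perm.EDIT_GROUP_INFO | Perm.ASSIGN_ROLE,
    "moderator": Perm.ADD_MEMBER | Perm.REMOVE_MEMBER,
    "member": Perm(0),
}

def authorize(role: str, action: Perm) -> bool:
    """Gate every group-management action on the acting member's role."""
    return action in ROLE_PERMS.get(role, Perm(0))
```

Delegating member add/remove to a "moderator" role is exactly what eases the administrative bottleneck in large groups that the paper describes.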

Thursday, May 20

CHRISTOPHER IJAMS

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
11:00 A.M.; Online
Project: Ethical Penetration Test for AAA Washington

Penetration testing is a type of ethical hacking in which an organization hires a skilled professional to find and exploit vulnerabilities on their network. With the continued rise of cyberattacks, modern best practices indicate that vulnerability scanning and penetration testing are essential for an organization to maintain a secure posture. To remain PCI-DSS compliant, organizations acting as a payment gateway must regularly execute penetration tests on their infrastructure. AAA Washington has expressed a need for an external penetration test on their internet-facing resources. This project sought to perform and document such a test for the organization while establishing a repeatable process for future work. The project identified and exploited vulnerabilities and weak configurations within assets owned by AAA Washington. A methodology tailored explicitly for external penetration testing was established during this process. The test documented here emphasizes interacting with hardened, internet-facing resources and a rigorous inspection of web applications. The project concludes with a redacted six-chapter penetration test report outlining all findings and recommendations for remediation.


PRINCETON SEE

Chair: Dr. Marc Dupuis
Candidate: Master of Science in Cybersecurity Engineering
1:15 P.M.; Online
Project: Control Gap Analysis for AAA Washington

Every organization with data to protect needs to ensure that they have controls in place to mitigate or minimize cyber threats and risks. Due to the evolving nature of the cybersecurity threat landscape, a yearly risk assessment is crucial for keeping up to date with the latest attacks. As part of a larger risk assessment, a control gap analysis allows an organization to perform a detailed breakdown of how the controls in place measure up to commonplace standards. AAA Washington plans to migrate into a hybrid-cloud environment and has requested a control gap analysis of their organization. This project used the CIS Top 20 Controls as the basis of the gap analysis and also devised a theoretical risk model to help standardize the current risks to the organization. The goal of the project is the creation of a risk assessment document that is accepted by AAA Washington and used as its reference for future years. The successful implementation of the theoretical risk model may see it adopted for use in yearly risk assessments.
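As a rough illustration of what a standardized risk model can look like, the sketch below scores each risk as likelihood × impact on ordinal 1-5 scales and maps the score to a level. The scales and thresholds are hypothetical, not the model devised for AAA Washington:

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Ordinal risk model: likelihood and impact each rated 1 (rare/minor)
    to 5 (frequent/severe); the product makes risks comparable."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("ratings must be on the 1-5 scale")
    return likelihood * impact

def risk_level(score: int) -> str:
    """Map a 1-25 score to a reporting level (thresholds are illustrative)."""
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"
```

Standardizing on a scheme like this lets a yearly assessment compare risks across control gaps and track whether remediation moved a risk's level.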

Back to top

Master of Science in Electrical Engineering

SUMMER 2022

Tuesday, August 9

MOOSA RAZA

Chair: Dr. Seungkeun Choi
Candidate: Master of Science in Electrical Engineering
8:45 A.M.; Online
Thesis: Multilevel Resistive Switching in a Metal Oxide Semiconductor based on MoO3

Over the years, resistive random-access memory (ReRAM) has received much attention among emerging memory technologies due to its simple structure and fabrication process, cost-effective development, low power consumption, scalability, high throughput, and other attractive memory characteristics.

Multilevel switching operation of the stacked ReRAM device based on MoO3 has been investigated using the compliance-current control method, where the device exhibited 2-bit-per-cell memory storage density. The device realized bipolar resistive switching with a high-resistance state (HRS) varying between 11.7 Ω and 90 Ω and a low-resistance state (LRS) varying between 3.89 Ω and 47 Ω, read at 0.01 V during endurance testing, exhibiting a variable OFF/ON ratio between 1.6 and 15. The device also showed insignificant variations in the switching voltages, with the set voltage (Vset) between 0.22 V and 0.27 V and the reset voltage (Vreset) between 0.15 V and 0.30 V, over 11 resistive switching cycles when swept between -0.5 V and +0.5 V.

In addition, the unique resistive switching behavior of a novel lateral ReRAM device based on MoO3 is reported, showing multiple set and reset voltages in both the positive and negative voltage regimes while maintaining consistency across the switching voltages: Vset A, Vset B, Vset C, Vreset A, and Vreset B were observed around -40 V, 40 V, -10 V, 40 V, and -40 V, respectively, throughout the 105 switching cycles. The device also exhibits a self-compliance property at much smaller currents of around a microampere (≅ 0.9 μA), making it suitable for a wide range of power applications. Further investigation is required to determine the plausible applications of the unique resistive switching properties achieved with the lateral ReRAM.

AUTUMN 2021

Friday, December 3

SHARMILA DEVI KANNIVELU

Chair: Dr. Sunwoong S. Kim 
Candidate: Master of Science in Electrical Engineering
8:45 A.M.; Online
Thesis: Privacy-Preserving Image Filtering and Thresholding Using Numerical Methods for Homomorphically Encrypted Numbers

Homomorphic encryption (HE) is an important cryptographic technique that allows one to directly perform computation on encrypted data without decryption. In HE-based applications using digital images, a user often encrypts a private image captured on a local device. This image can contain noise that negatively affects the results of HE-based applications. To solve this problem, this thesis paper proposes an HE-based locally adaptive Wiener filter (HELAWF). For small encrypted input data, pixels that have no dependency when sliding a window are encoded into the same ciphertext. For division in the adaptive filter, which is not supported by conventional HE schemes, a numerical approach is adopted. Image thresholding is a method of segmenting a region of interest and is used in many real-world applications. Typically, image thresholding involves a comparison operation, but this operation is also not supported in conventional HE schemes. To solve this problem, a numerical approach to the comparison operation is used in the proposed HE-based image thresholding (HETH). The proposed HELAWF and HETH designs are integrated and implemented as a proof-of-concept client-server model. In practical HE schemes, the number of consecutive multiplications on encrypted data is limited. Therefore, the number of iterations of the numerical methods used in the integrated design is carefully chosen. To the best of the authors' knowledge, this thesis paper is the first work that applies approximate division and comparison operations over encrypted data to image processing algorithms. The proposed solutions can address important privacy issues in image processing applications in internet-of-things and cyber-physical systems, where many devices are connected through a vulnerable network.
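The numerical approach mentioned above replaces division and comparison — which conventional HE schemes do not provide — with short iterations built only from additions and multiplications. A plain-Python sketch of two commonly used iterations of this kind (shown here on plaintext numbers; in the actual system they would run on ciphertexts, with the iteration count bounded by the scheme's multiplicative depth, which is why it must be "carefully chosen"):

```python
def he_reciprocal(d, iters=12):
    """Approximate 1/d for d in (0, 2) using only add/mul:
    Newton's iteration x <- x * (2 - d * x). Each step costs one
    ciphertext-ciphertext multiply plus a plaintext-constant subtraction."""
    x = 2.0 - d  # initial guess, valid on (0, 2)
    for _ in range(iters):
        x = x * (2.0 - d * x)
    return x

def he_sign(x, iters=10):
    """Approximate sign(x) for x in [-1, 1] by iterating the odd
    polynomial (3x - x^3) / 2, which pushes values toward +/-1."""
    for _ in range(iters):
        x = (3.0 * x - x ** 3) / 2.0
    return x

def he_threshold(x, t, iters=10):
    """Approximate the step function 1[x > t] as (1 + sign(x - t)) / 2,
    assuming x - t lies in [-1, 1] (inputs are scaled accordingly)."""
    return (1.0 + he_sign(x - t, iters)) / 2.0
```

Each extra iteration sharpens the approximation but consumes multiplicative depth, which is the trade-off the integrated HELAWF/HETH design balances.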

SPRING 2021

Tuesday, June 1

COURTNEY CHAN CHHENG

Chair: Dr. Denise Wilson
Candidate: Master of Science in Electrical Engineering
5:45 P.M.; Online
Thesis: Abnormal Gait Detection using Wearable Hall-Effect Sensors

Abnormalities and irregularities in walking (gait) are predictors and indicators of both disease and injury. Gait has traditionally been monitored and analyzed in clinical settings using complex video (camera-based) systems, pressure mats, or a combination thereof. Wearable gait sensors offer the opportunity to collect data in natural settings and to complement data collected in clinical settings, thereby offering the potential to improve quality of care and diagnosis for those whose gait varies from healthy patterns of movement. This paper presents a gait monitoring system designed to be worn on the inner knee or upper thigh. It consists of low-power Hall-effect sensors positioned on one leg and a compact magnet positioned on the opposite leg. Wireless data collected from the sensor system were used to analyze stride width, stride width variability, cadence, and cadence variability for four individuals engaged in normal gait, two types of abnormal gait, and two types of irregular gait. Using leg gap variability as a proxy for stride width variability, 81% of abnormal or irregular strides were accurately identified as different from normal strides. Cadence was, surprisingly, 100% accurate in identifying strides that strayed from normal, but variability in cadence provided no useful information. This highly sensitive, non-contact Hall-effect sensing method for gait monitoring offers the possibility of detecting visually imperceptible gait variability in natural settings. These nuanced changes in gait are valuable for predicting early stages of disease and for indicating progress in recovering from injury.
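As a rough illustration of the leg-gap-variability proxy described above, the sketch below computes the coefficient of variation of per-stride leg-gap measurements and flags a sequence whose variability exceeds a multiple of a normal baseline. The threshold, factor, and sample values are hypothetical, not the study's detection rule:

```python
import numpy as np

def gap_variability(leg_gaps):
    """Coefficient of variation (std / mean) of per-stride leg-gap samples,
    used here as a proxy for stride-width variability."""
    gaps = np.asarray(leg_gaps, dtype=float)
    return gaps.std() / gaps.mean()

def is_abnormal(leg_gaps, baseline_cv, factor=2.0):
    """Flag a stride sequence whose variability exceeds factor x a
    normal-gait baseline (factor is an illustrative choice)."""
    return gap_variability(leg_gaps) > factor * baseline_cv
```

A steady gait produces tightly clustered gap readings (low coefficient of variation), while erratic gait inflates the spread even when the mean gap is unchanged — which is why variability, not the raw gap, carries the signal.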

WINTER 2021

Friday, March 12

RUOHAO “EDDIE” LI

Chair: Dr. Kaibao Nie
Candidate: Master of Science in Electrical Engineering
11:00 A.M.; Online
Thesis: Improving Keywords Spotting in Noise with Augmented Dataset from Vocoded Speech and Speech Denoising

As more electronic devices ship with an on-device Keywords Spotting (KWS) system, producing and deploying trained models for keyword detection is becoming more demanding. Dataset preparation is one of the most challenging and tedious tasks in keyword spotting, requiring a significant amount of time to obtain raw or segmented audio speech. In this thesis, we first proposed a data augmentation strategy that uses a speech vocoder to artificially generate vocoded speech with different numbers of channels. Such a strategy can increase the dataset size by at least two-fold, depending on the use case. With the new features introduced by the different channel counts of the vocoded speech, a convolutional neural network (CNN) KWS system trained on the augmented dataset showed promising improvement when evaluated under a +10 dB SNR noisy condition. The same results were confirmed in a hardware implementation, showing that vocoded speech for data augmentation has the potential to improve KWS on microcontrollers. We further proposed a neural-network-based speech denoising system that uses the Weighted Overlap-Add (WOLA) algorithm for feature extraction for more efficient processing. The proposed speech denoising system learns a regression from noisy speech (as input) to clean speech (as output), so the input to the KWS system is relatively clean speech. Furthermore, by changing the training target to vocoded speech, the denoising system can convert noisy speech into vocoded speech. The combination of speech denoising and vocoded-speech data augmentation achieved relatively high accuracy when evaluated under a +10 dB SNR noisy condition.
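For intuition, a channel vocoder of the kind used for this augmentation splits the signal into frequency bands, extracts each band's envelope, and re-imposes the envelopes on carriers; varying the number of bands yields acoustically distinct variants of the same utterance. A crude numpy sketch using FFT band masks and rectified envelopes (a real vocoder would smooth the envelopes with a lowpass filter; the band layout here is an assumption):

```python
import numpy as np

def noise_vocode(signal, fs, n_channels, seed=0):
    """Crude noise vocoder: split the spectrum into n_channels equal bands,
    take each band's rectified envelope, and re-impose it on band-limited noise."""
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    edges = np.linspace(0.0, fs / 2.0, n_channels + 1)
    spectrum = np.fft.rfft(signal)
    noise = np.random.default_rng(seed).standard_normal(len(signal))
    noise_spectrum = np.fft.rfft(noise)
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spectrum * mask, n=len(signal))
        envelope = np.abs(band)  # rectification only; no lowpass smoothing
        carrier = np.fft.irfft(noise_spectrum * mask, n=len(signal))
        out += envelope * carrier
    return out
```

Rendering each training utterance at several channel counts is what lets one recording stand in for multiple augmented examples.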

SPRING 2019

Friday, June 7

FEIFAN LAI

Chair: Dr. Kaibao Nie
Candidate: Master of Science in Electrical Engineering
11:00 A.M.; DISC 464
Thesis: Intelligent background sound event detection and classification based on WOLA spectral analysis in hearing devices

Audio signals from real-life hearing devices typically contain background noise. The purpose of this thesis is to build a system model that can automatically separate background noise from noisy speech and then classify the background sound into predefined event categories. This thesis proposes using the weighted overlap-add (WOLA) algorithm for feature extraction and a feed-forward neural network for sound event detection. In this approach, an energy-signal trough-detection algorithm is used to separate out speech gaps, which primarily contain background noise. To further analyze the noise signal's spectrum, the WOLA algorithm extracts spectral features by transforming a frame of the time-domain signal into frequency-domain data represented in 22 channels. Moreover, a feed-forward neural network with one hidden layer is used to recognize each event's distinctive spectral feature pattern. It then produces classification decisions based on confidence values. Recordings of 11 realistic background noise scenes (cafe, station, hallway, …), mixed with human speech at a Signal to Noise Ratio (SNR) of 5 dB, are used for training. The neural network learns the mapping between spectral feature characteristics and sound event categories. After training, the neural network classifier is evaluated by measuring the accuracy of event classification. The overall detection accuracy reached 96%, while the event 'hallway' had the lowest detection rate at 85%. This detection algorithm can also improve noise reduction in hearing devices by applying distinct compensation gains that attenuate the noise-dominated frequency bands for each predefined event. In our preliminary evaluation experiment, the application of gain patterns proved effective in reducing background noise. Combined with instant gain patterns, it would produce improved results with noticeably attenuated noise and smooth spectral cues in the processed audio output.
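The front end described above reduces each audio frame to a small vector of band energies. A simplified stand-in — a windowed FFT whose power spectrum is pooled into 22 bands — illustrates the shape of the feature the classifier consumes (a real WOLA filter bank uses weighted, overlapped analysis/synthesis windows, and the band layout here is an assumption):

```python
import numpy as np

def band_energy_features(frame, n_channels=22):
    """Windowed FFT of one frame, with the power spectrum pooled into
    n_channels band energies -- a simplified stand-in for a WOLA filter bank."""
    windowed = np.asarray(frame, dtype=float) * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    edges = np.linspace(0, len(power), n_channels + 1).astype(int)
    return np.array([power[a:b].sum() for a, b in zip(edges[:-1], edges[1:])])
```

Feeding a 22-value vector rather than the raw spectrum keeps the feed-forward network small, which matters for hearing-device hardware.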

SUMMER 2018

Friday, August 3

MALIA STEWARD

Chair: Dr. Seungkeun Choi
Candidate: Master of Science in Electrical Engineering
3:30 P.M.; DISC 464
Thesis: Development of Corrugated Wrinkle Surface for an Organic Solar Cell

There has been great interest in organic photovoltaics (OPVs) due to their potential for the development of low-cost, high-throughput, large-area solar cells with a flexible form factor. Accordingly, the power conversion efficiency (PCE) of OPVs has improved dramatically over the past two decades. Although the PCE of OPVs now exceeds 10%, the PCE of these thin-film solar cells is fundamentally limited by the ability of the photo-active layer to absorb the incident sunlight. The external quantum efficiency (EQE) describes this ability and rarely exceeds 70% for state-of-the-art OPVs, implying that only 70% of incident photons contribute to photo-current generation. The EQE can be improved by trapping more light in the active layer, which is very challenging for thin-film photovoltaics.

In this research, I investigated optimization of organic solar cell fabrication by tuning a charge-carrier transport layer and developed a new metallization method to replace the vacuum-deposited silver electrode with electroplated copper, which is less expensive and better suited to industrial manufacturing. I also investigated a number of methods to fabricate an optimal wrinkle structure that can serve as a light-trapping vehicle for organic solar cells. I fabricated wrinkles on SU-8 polymer by controlling the softness of the SU-8. While wrinkles are generally produced after metal deposition, I found that a more suitable wrinkle profile can be fabricated before the metal deposition. Future work will focus on the development of reproducible, scalable, high-throughput wrinkle fabrication with an optimum profile, and on the demonstration of highly efficient organic solar cells with wrinkle-enhanced light trapping.

Back to top