Intro

Hi! I'm Yuan Tang (Naomi). I am currently a Master’s student in the Department of Electrical and Computer Engineering at Carnegie Mellon University. I have broad interests in machine learning and software engineering.



Education

Carnegie Mellon University

Master of Science in Electrical and Computer Engineering

2021 - 2022

Courses:

  1. Foundations of Computer Systems
  2. Distributed Systems
  3. Machine Learning for Signal Processing
  4. Software Construction
  5. Information Theory
  6. Foundations of Privacy

Nanyang Technological University

Bachelor of Engineering in Electrical and Electronic Engineering, with a minor in Mathematics

2017 - 2021

Relevant Courses:

  1. Data Structures / Algorithms / Advanced Algorithms
  2. Machine Learning / Data Mining / Artificial Intelligence
  3. Database Systems
  4. Web Application Design
  5. Computer Architecture
  6. My statistics minor - Probability / Statistics / Regression Analysis
  7. All sorts of other math classes - Linear Algebra / Calculus / Discrete Math

I have also spent time at the University of Wisconsin-Madison for a semester exchange and at the University of California, Los Angeles for summer study. Shout-out to those GPA-free days.



Projects

Machine Learning based Medical Image Analysis

A*STAR - Agency for Science, Technology and Research

  1. Developed a generative adversarial segmentation network for semantic segmentation of organs in images of different body parts, such as chest x-rays, achieving 2% higher IoU than the state of the art; also experimented with various U-Net based segmentation networks across different modalities, such as prostate MRI and liver CT.
  2. Constructed a chest x-ray pneumonia detection algorithm that achieved accuracy over 0.96 while generating a heatmap of infected regions; the model was deployed at a local hospital for COVID-19 screening.

HPC Enabled CNN in Recognition of Hand Motion

NTU School of Computer Science and Engineering

  1. Implemented a region ensemble network to predict the 3D positions of 21 hand joints, with the aim of assessing the risk of rheumatoid arthritis by measuring and displaying the angle and speed of hand motions; the method serves as a cheaper and more convenient alternative to x-rays.
  2. Created an application with OpenCV and Python that visualizes hand motions in real time, enabling interactive use.


Internships

Machine Learning Engineer Intern

Alibaba Group, Hangzhou

Jul 2021 - Aug 2021

  1. Developed an industry-level recommendation model with spatial-temporal sequence modeling that improved on the existing model by 2% in click-through rate prediction AUC; the model will be deployed on a food delivery platform with over 10 million daily active users.
  2. Initiated the spatial-temporal modeling project, covering problem identification, data analysis, research, and development.

Data Scientist Intern

Shopee, Singapore

Jun 2020 - Aug 2020

  1. Enhanced the existing machine translation model by developing a class-embedded transformer that incorporates additional categorical information during training, achieving an average improvement of 0.58 BLEU in a multilingual translation task.
  2. Developed an optimized multilingual sub-word tokenization model that reduced memory and time consumption for corpus pre-processing and improved the baseline Chinese-English machine translation model by 2 BLEU points.

Machine Learning Engineer Intern

Hikvision, Hangzhou

Jun 2019 - Aug 2019

  1. Built a multi-label sentiment classification model using ELMo embeddings and a TextCNN model, which achieved accuracies over 0.9 on 17 labels from e-commerce website customer reviews; the sentiment information was used directly by the product team to analyze customer feedback.
  2. Investigated different text data augmentation techniques, word embedding algorithms, and model structures to improve the performance of the text classification task.

Data Analyst Intern

ViSenze, Singapore

Mar 2019 - May 2019

  1. Automated the creation of standardized deep learning image datasets so that non-technical staff could preprocess data; reduced the preprocessing time from 2 hours to 1 minute.
  2. Examined the performance of classification models by analyzing metrics and edge cases to explain flaws and propose improvements.
  3. Analyzed user query logs from major clients to investigate API usage patterns and summarize user behavior; the statistics were presented to management for service improvements.