EUNICE CHOI

Hi, I'm Eunice!

I'm a junior (Class of 2026) studying computer science at the Massachusetts Institute of Technology.
My interests include software development, artificial intelligence, machine learning, computer vision, natural language processing, and data analytics.

The leading principles of my life are collaboration, creativity, proactivity, curiosity, and impact, and I am always looking for new opportunities to grow and learn.

Experiences

Burmester & Vogel

link

Python

Optical Character Recognition

Natural Language Processing

PyTorch

TensorFlow

scikit-learn

NumPy

unit testing (pytest)

Azure Cloud

Elasticsearch

LangGraph

REST APIs

CI/CD

Postman

For the past year, I've been working at Burmester & Vogel as one of 6 AI software developer interns.

Burmester & Vogel is a startup specializing in developing software to digitalize the maritime shipping industry. Prior to our work, the industry was heavily reliant on paper documents and manual review, leading to slow turnaround, human error, and high costs.

Our work ultimately seeks to digitalize supply chains for hundreds of clients in 50+ countries, reducing delays, optimizing resource use, and minimizing environmental impact through smarter, data-driven logistics.

❤️ My Experience:

Having joined the team in its early stages, I've been able to contribute to the development of the company's core products and see the direct impact of our work on the maritime industry. I've learned a lot about developing industry-grade software, working in a team, and the importance of data-driven decision-making. I love onboarding new interns and helping them learn the ropes because it reminds me of how lost I was when I first started and how much I've grown in the past year.

💻 The How:

Digitized and classified 2k+ unstructured handwritten documents using Optical Character Recognition (OCR) and Natural Language Processing (NLP) to analyze numerical/semantic differences and merge documents
Developed the backend of a patent-pending, interactive visualization tool for the maritime shipping industry to create audit reports in PDF and web formats
Created predictive models using PyTorch, TensorFlow, scikit-learn, and NumPy to forecast trends in cargo ship loading efficiency
Implemented custom search querying across 1k+ historical documents and books via Elasticsearch and LangGraph
Spearheaded backend unit testing with pytest and Postman tests for REST API endpoints, integrating testing into the CI/CD pipeline
Sped up bug tracking by automating health checks and the tracking of user-encountered errors with Trello for over 15 Azure Functions, streamlining Agile workflows and issue resolution
Integrated Azure Cloud for scalable data storage
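
To make the "numerical/semantic differences" step above more concrete, here is a minimal sketch using only Python's standard library. It is an illustration under assumptions, not the code used at the company: the function names and sample strings are hypothetical, the input is assumed to be text already produced by OCR, and word-level overlap stands in for a real semantic comparison.

```python
import re
from difflib import SequenceMatcher

# Matches integers, decimals, and grouped numbers such as "1200", "12.05", "1,200".
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def extract_numbers(text: str) -> list[str]:
    """Return every numeric token in the text, in reading order."""
    return NUMBER.findall(text)


def numeric_differences(old: str, new: str) -> list[tuple[str, str]]:
    """List (old, new) pairs for the numbers that differ between two texts."""
    a, b = extract_numbers(old), extract_numbers(new)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


def text_similarity(old: str, new: str) -> float:
    """Word-level similarity in [0, 1]; a crude stand-in for a semantic score."""
    return SequenceMatcher(None, old.lower().split(), new.lower().split()).ratio()


# Hypothetical OCR output from two versions of the same document.
scan_a = "Loaded 1200 tons at berth 4 on 12.05"
scan_b = "Loaded 1250 tons at berth 4 on 12.05"
print(numeric_differences(scan_a, scan_b))
print(round(text_similarity(scan_a, scan_b), 3))
```

Separating the numeric comparison from the text comparison means a single changed figure is reported explicitly even when the two documents are otherwise nearly identical.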

🌍 The Impact:

The digitalization reduces time spent on manual review by up to 90%, saving users hundreds of hours each month
The audit reports reduce human error by 75% and streamline contract disputes, saving companies $100,000+ per document
The predictive models are available to users in a data dashboard, helping them optimize shipping delay costs, with potential savings of $1,000,000+ per year
The unit tests and automated Trello workflow currently achieve 100% API test coverage, reduce pre-deployment bugs by 75%, and accelerate issue resolution by 80%

MIT Computer Science and Artificial Intelligence Lab

link

Python

Natural Language Processing

Deep Learning

Graph Embeddings

HuggingFace

TensorFlow

PyTorch

pandas

NumPy

Matplotlib

MATLAB

LangChain

audio diarization

voice/face prints

I worked as a research assistant at the MIT Computer Science and Artificial Intelligence Lab (CSAIL) as a part of the Mantis Web Team under Professor Manolis Kellis.

In our contemporary world, there is so much complex, unstructured, relational data that is difficult to represent and visualize in a way that is intuitive to the human brain.

Our team developed graph-embedding visualization software for complex datasets, including research papers, songs, videos, git logs, news stories, and documents, that allows users to query, explore, and curate thousands of data points based on meaning, not just numbers.

❤️ My Experience:

This was my first experience working in a research lab, and I learned a lot about the importance of collaboration and communication in a team, as well as the importance of setting your ego aside and not being afraid to ask questions. I was just a freshman at this point, so I spent a large part of this experience asking for help. I had never written deployable code or collaborated on a shared codebase, so it took a lot of adjusting, but I learned a lot and am grateful for the opportunity to contribute to software that has the potential to revolutionize the way we analyze complex datasets.

💻 The How:

Used Large Language Models (LLMs) and Word2Vec to generate embeddings of complex, unstructured data
Fine-tuned clustering algorithms to cluster thousands of data points by topological proximity based on semantic, numerical, temporal, categorical, and hierarchical relations using Natural Language Processing and Matplotlib
Used NumPy, pandas, and MATLAB to refine dimensionality reduction algorithms that transform the relations between data points into a 2D or 3D space
Used Python ML libraries such as HuggingFace, TensorFlow, and PyTorch to build a pipeline that incorporates user annotations on data points through incremental learning
Used time-series analysis and forecasting techniques to identify emerging trends over time, detecting seasonal/cyclical patterns and correlations within clusters in the datasets
Cleaned dozens of datasets to extract voice/face prints and perform audio diarization using PyAnnote
Contributed to implementing search querying across ontologies, topics, entities, and metadata of datasets using LangChain
Secured over $2M in funding; currently awaiting patent approval
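
The idea of exploring data "based on meaning" can be illustrated with a small sketch: once each item has an embedding vector, items with similar meaning sit close together, and a query becomes a nearest-neighbour lookup. The example below is an assumption-laden toy, not the lab's implementation: the three-dimensional vectors and item names are made up, and real embeddings from an LLM or Word2Vec would have hundreds of dimensions.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query: list[float], embeddings: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k items most similar to the query vector."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings; the values are invented for illustration.
embeddings = {
    "paper_a": [1.0, 0.0, 0.1],
    "paper_b": [0.8, 0.3, 0.0],
    "song_c": [0.0, 1.0, 0.2],
}
print(nearest([1.0, 0.0, 0.0], embeddings, k=2))
```

Cosine similarity compares direction rather than magnitude, which is a common choice for text embeddings; a clustering or dimensionality reduction step would typically operate on the same vectors.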

🌍 The Impact:

Although still in beta, the software eliminates the bottlenecks of traditional, manual ontology creation, enabling dynamic, real-time adaptation to evolving knowledge domains
Automates data annotation and organization, enabling organizations to make data-driven decisions faster and smarter

MIT Department of Urban Studies and Planning

full-stack development

end-to-end development

Next.js

React.js

TypeScript

JavaScript

MATLAB

REST APIs

I worked as a research assistant at the MIT Department of Urban Studies and Planning, developing a data analysis platform for assessing water affordability.

Data-driven decisions can be one of the most empowering tools for policy-makers, but the data is often complex and difficult to interpret. The platform we developed aimed to simplify data analysis for utility companies, helping them assess which areas are experiencing the most water shut-offs and make informed policy decisions regarding water affordability.

❤️ My Experience:

This was my first time independently working on the full-stack development of a platform that is used by real people. I learned a lot about the importance of user experience and how to design a platform that is intuitive and easy to use for all ages. I was able to see the direct impact of our work on the utility companies and the people they serve, and it was a very rewarding experience.

💻 The How:

Spearheaded the full-stack development of a data analysis platform using Next.js, React.js, TypeScript, and JavaScript
Integrated MATLAB and R scripts performing advanced analysis of geospatial and socioeconomic factors from the US Census Bureau and the American Community Survey

🌍 The Impact:

Helped 200+ utility companies make informed policy decisions that will help millions of people across the nation get access to clean water

Some things I've been up to lately...

RecipeBytes

link

source code

Python

TypeScript/JavaScript

Computer Vision

Natural Language Processing

AWS (EC2, Lambda, RDS, S3)

Next.js (Node.js, React.js)

Flask

MySQL

Tired of constantly pausing a TikTok to catch the recipe?

I've developed a web app that lets users extract and save recipes from short-form videos like Instagram Reels using computer vision, NLP, and scraping.

❤️ My Experience:

💻 The How:

I used OpenCV and pytesseract to track hand movements and extract text instructions from the screen, used OpenAI's Whisper model to transcribe voiceovers, and scraped captions and comments to extract full recipes from videos.
I then integrated custom NLP models such as NER (Named Entity Recognition), CRF (Conditional Random Field), and sequence classification using BERT-based models from HuggingFace to parse the extracted recipes.
I used MySQL so users can come back to saved recipes in their accounts.
For the front end, I used TypeScript with Next.js and React.js, and I used Flask for the backend.
To deploy everything, I used AWS: the web server on EC2, the database on RDS, the ML models stored on S3, and AWS Lambda to call the models.
Due to limited resources, I am currently working on a more efficient way to deploy the computer vision models, but the basic functionality is deployed.
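
To show what "parsing the extracted recipes" means in practice, here is a deliberately simplified sketch that turns one ingredient line into structured fields. It swaps in a regular expression for the NER/CRF/BERT models described above, so it is a rule-based stand-in and not the app's actual parser; the `Ingredient` class, `parse_ingredient` function, and unit list are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Illustrative unit list; a real parser would need far more coverage.
UNITS = ("cups", "cup", "tbsp", "tsp", "kg", "g", "ml", "oz")

INGREDIENT = re.compile(
    r"^\s*(?P<qty>\d+(?:[./]\d+)?)\s*"               # quantity: 2, 1/2, 0.5
    r"(?:(?P<unit>" + "|".join(UNITS) + r")\b)?\s*"  # optional unit
    r"(?P<name>.+?)\s*$",                            # remaining text is the name
    re.IGNORECASE,
)


@dataclass
class Ingredient:
    quantity: str
    unit: Optional[str]
    name: str


def parse_ingredient(line: str) -> Optional[Ingredient]:
    """Parse a line like '2 cups flour'; return None if no quantity is found."""
    match = INGREDIENT.match(line)
    if match is None:
        return None
    unit = match.group("unit")
    return Ingredient(
        quantity=match.group("qty"),
        unit=unit.lower() if unit else None,
        name=match.group("name"),
    )


for line in ["2 cups flour", "1/2 tsp salt", "3 eggs", "salt to taste"]:
    print(line, "->", parse_ingredient(line))
```

A rule like this breaks quickly on free-form captions and transcripts ("a pinch of salt", "flour, about two cups"), which is the kind of input where a learned sequence model is the more robust choice.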

AI Chatbot for Cleft Patient Healthcare

end-to-end development

Java

Kotlin

Apache Spark

Apache Kafka

Pillow/PIL

Computer Vision (OpenCV)

PyTorch

Natural Language Processing

spaCy

Redis

Postgres

Twilio API

I worked as a developer for the Global Smile Foundation, where I led a team of 4 peer developers and consulted with multiple healthcare professionals to develop an AI chat app that connects 1,000+ cleft palate patients across 12 countries in Latin America to healthcare providers during the post-operation recovery period.

Key features include real-time patient-professional connections, patient data querying, symptom image recognition, and an AI chatbot supporting 3 languages, so that patients in remote areas can ask questions and access information and healthcare after receiving surgery.

❤️ My Experience:

This was one of my most rewarding experiences learning about end-to-end development with a relatively small team. MIT's Code for Good is structured in a way where there are no technical mentors, but we all learn from each other instead. A huge challenge in this project was security, since we would be dealing with real patient healthcare data, but with the help of the technicians at the Global Smile Foundation, we were able to find a secure way to store patient data. Seeing the technology we built remove barriers and directly impact real people in such a tangible way reinforced why I do what I do.

💻 The How:

We developed an Android chatbot app using Kotlin and Java to provide information about cleft conditions and answer common questions
Integrated Apache Spark to process the large dataset of patient information
Used Postgres to store patient data and allow for querying, and used Redis for caching
Used Apache Kafka to facilitate real-time calls/chats between patients and healthcare providers
Developed and trained a symptom image recognition model using Pillow/PIL, OpenCV, and PyTorch
Used spaCy to train our chatbot in 3 languages (English, Portuguese, and Spanish)
Integrated the chat into WhatsApp using the Twilio API so users can rely on technology they are already familiar with

🌍 The Impact:

Many of these patients live in remote areas with limited access to healthcare and are often left with questions after receiving cleft-palate surgery; the chatbot gives them a way to ask those questions and reach healthcare providers
Since a significant portion of our patients are children, who may not be able to verbally express their discomfort or symptoms, our symptom image recognition helps better assess their condition and provide more accurate care

Skills

Languages: Python, JavaScript, TypeScript, Java, Kotlin, R, C, C++, C#, SQL, MATLAB

Databases: MySQL, PostgreSQL, NoSQL (MongoDB)

Frameworks/Libraries: Node.js/Express/Next.js, React.js, Flask, Django, Angular, Vue, Prisma

Machine Learning & Data Science: PyTorch, TensorFlow, HuggingFace, OCR (Optical Character Recognition), NLP (Natural Language Processing, LLMs, spaCy), computer vision (OpenCV, Pillow/PIL), LangChain/LangGraph, scikit-learn, Matplotlib, pandas, NumPy, Apache Spark

Software Development Practices/Tools: Git version control, Docker, Azure, AWS, REST APIs, Apache Spark, Redis, Full-stack development (front-end + back-end), end-to-end development, Android mobile development, CI/CD, pytest (unit testing), OOP (Object-Oriented Programming), Agile methodology (Trello, Jira)

About Me

I'm currently a junior pursuing a Bachelor of Science (B.S.) in Computer Science and Engineering and concentrating in Writing at the Massachusetts Institute of Technology (MIT). I was born in California, raised in Vegas, and am now based in Boston. I'm naturally curious and am super excited to learn new skills to solve complex problems.

Things I'm learning right now: iOS mobile development, building a paper trading bot, sewing

Things I want to learn more about: Go, Three.js, Spring Boot, woodworking, metalworking, embedded engineering, literature, and golfing

Activities

🤺

I'm on the MIT varsity fencing team and have been fencing epee since high school

🧑‍🏫

I'm a lab assistant for MIT's 6.1900 (Intro to Low-Level Programming in C and Assembly) course

💜

I'm a consultant for MIT's Code For Good club

🐉

I've been on the exec board for MIT's Asian American Association since freshman year and currently serve as treasurer.

💻

I'm a web developer for MIT's Educational Studies Program

🍳

I love trying out new recipes, restaurants, and cafes

Current Favorites

Books: Tuesdays with Morrie by Mitch Albom, The Idiot by Elif Batuman, and The Stranger by Albert Camus

Shows: Arcane, Culinary Class Wars, Cunk on Earth, and Brooklyn Nine-Nine

Movies: Interstellar, Hereditary, Parasite

Artists: Mitski, Beabadoobee, Big Thief, and Laufey
