

2D RNA Folding ML Class Competition
Built a DeepResUNet-Transformer hybrid from scratch, achieving the highest F1 score in a class of PhD and Master's students.
Overview
Course project for COEN 432 (Evolutionary Algorithms and Machine Learning) that turned into one of my biggest academic wins. The task: predict the 2D folding of RNA sequences, that is, which nucleotides pair with which.
Result: Highest F1 score in the entire class.
The class was full of PhD and Master's students. I'm an undergraduate. I beat them all — and not by a small margin.
The Challenge
Given an RNA sequence (a string of A, U, G, C nucleotides), predict its 2D folding pattern: output a contact map showing which bases pair with each other.
This is a real bioinformatics problem. RNA structure determines function, and accurate prediction has implications for drug design, genetic research, and understanding disease mechanisms.
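Concretely, for a sequence of length N the target is a symmetric N x N binary matrix. A minimal sketch of the representation (the hairpin sequence and its pairs here are made up for illustration, not competition data):

```python
import numpy as np

# Illustrative hairpin: G0-C8, G1-C7, G2-C6 pair (0-based indices).
sequence = "GGGAAACCC"
pairs = [(0, 8), (1, 7), (2, 6)]

n = len(sequence)
contact_map = np.zeros((n, n), dtype=np.int8)
for i, j in pairs:
    contact_map[i, j] = contact_map[j, i] = 1  # base pairing is symmetric

print(contact_map)
```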
My Approach
Instead of using an off-the-shelf architecture, I built a hybrid model from scratch:
DeepResUNet-Transformer
| Component | Purpose |
|---|---|
| ResNet backbone | Extract hierarchical features from sequence data |
| U-Net architecture | Encoder-decoder with skip connections for spatial precision |
| Custom Transformer | Capture long-range dependencies between nucleotides |
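To make the composition concrete, here is a rough sketch of how the three pieces fit together. Everything in it is illustrative rather than my exact code: the layer sizes, the assumed pair featurization (concatenated one-hot channels for nucleotides i and j), and the use of PyTorch's built-in attention module as a stand-in (my own attention implementation is shown just below).

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Two 3x3 convs with an identity shortcut (ResNet-style)."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))  # residual connection

class HybridSketch(nn.Module):
    """Illustrative composition of the three components (not the exact model)."""
    def __init__(self, ch=32, heads=4):
        super().__init__()
        # 8 input channels: concatenated one-hot encodings of nucleotides i and j
        # (an assumed featurization of the pairwise grid)
        self.stem = nn.Conv2d(8, ch, 3, padding=1)
        self.enc = ResBlock(ch)
        self.down = nn.Conv2d(ch, ch, 3, stride=2, padding=1)  # encoder downsample
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.up = nn.ConvTranspose2d(ch, ch, 2, stride=2)      # decoder upsample
        self.dec = ResBlock(ch)
        self.head = nn.Conv2d(ch, 1, 1)                        # per-(i, j) pairing logit

    def forward(self, x):                      # x: (B, 8, N, N), N assumed even
        s = self.enc(self.stem(x))             # encoder features, kept for the skip
        z = self.down(s)
        b, c, h, w = z.shape
        t = z.flatten(2).transpose(1, 2)       # (B, h*w, C) tokens for attention
        t, _ = self.attn(t, t, t)              # long-range dependencies
        z = t.transpose(1, 2).reshape(b, c, h, w)
        z = self.up(z) + s                     # U-Net skip connection
        return self.head(self.dec(z))          # (B, 1, N, N) contact logits
```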
I implemented the transformer attention mechanism myself — not imported from a library, but built from scratch to understand exactly how it works.
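At its core, that mechanism is scaled dot-product attention. A minimal single-head version of the standard formulation (the full model wraps this in learned multi-head projections):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # pairwise similarity
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)            # each query's weights sum to 1
    return weights @ v, weights

# Usage: 3 query/key/value vectors of width 8
q = k = v = torch.randn(1, 3, 8)
out, attn = scaled_dot_product_attention(q, k, v)      # out: (1, 3, 8), attn: (1, 3, 3)
```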
The Learning Process
This wasn't just about winning. I spent an immense amount of time learning:
- How U-Net works (and why skip connections matter)
- How ResNet's residual connections enable deep networks (see the sketch after this list)
- How transformers capture relationships across long sequences
- Training dynamics, hyperparameter sweeps, and debugging loss curves
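As a concrete illustration of the residual-connection point: without the identity path, the input gradient through a deep stack typically shrinks toward zero, while with it, useful gradient still reaches the first layer. A toy demo (the depth, width, and activation are arbitrary):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
depth, dim = 30, 16
layers = [nn.Linear(dim, dim) for _ in range(depth)]

def forward(x, residual):
    for layer in layers:
        h = torch.tanh(layer(x))
        x = x + h if residual else h  # the identity path is the only difference
    return x

for residual in (False, True):
    x = torch.randn(1, dim, requires_grad=True)
    forward(x, residual).sum().backward()
    print(f"residual={residual}: input grad norm = {x.grad.norm():.2e}")
```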
Competition Format
This was structured as a Kaggle competition with:
- Public leaderboard: Feedback during development
- Private leaderboard: Final evaluation (hidden test set)
I had the highest score on both.
When I asked the professor for feedback, he confirmed: "You have the highest score by far, by a big margin."
Why Solo?
The project allowed teams of two. I chose to work alone — not because I don't like collaboration, but because I wanted to learn everything deeply. Every architecture choice, every training run, every failure was mine to understand.
Technical Details
- Framework: PyTorch
- Training: Multiple architecture experiments, hyperparameter sweeps
- Metrics: F1 score optimization for imbalanced base-pair prediction
- Hardware: GPU training with mixed precision (FP16)
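The last two points interact: most (i, j) entries of a contact map are zero, so the loss has to account for class imbalance, and FP16 training needs loss scaling to keep gradients from underflowing. A sketch of a single training step under those constraints (the names and the pos_weight value are illustrative, not my exact training code):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

def train_step(model, optimizer, scaler, features, target):
    """One mixed-precision step; `features`/`target` names are illustrative."""
    # pos_weight upweights the rare positive class: most (i, j) pairs don't pair.
    # The value 20.0 is a placeholder, not a tuned number.
    pos_weight = torch.tensor(20.0, device=target.device)
    optimizer.zero_grad(set_to_none=True)
    with autocast():  # run the forward pass in FP16 where safe
        logits = model(features)
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            logits, target, pos_weight=pos_weight)
    scaler.scale(loss).backward()  # scale the loss so FP16 grads don't underflow
    scaler.step(optimizer)         # unscales gradients, then steps
    scaler.update()
    return loss.item()

# scaler = GradScaler() is created once, outside the training loop
```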
What This Proved
I can compete with graduate students in ML when I put in the work. The key wasn't being smarter — it was investing the time to truly understand what I was building instead of copy-pasting someone else's solution.