
It's not about reinventing the wheel, it's about understanding how it spins

  • Writer: Tommaso Pardi
  • Feb 28
  • 3 min read

Updated: Mar 1

Learning things from scratch is the best way to understand them

Why Bother with the Hard Way?

In the fast-paced world of AI, pre-made libraries and tools are like jetpacks: strap them on, and you're soaring through development in no time. PyTorch, TensorFlow, Hugging Face are the giants whose shoulders we stand on, and thank goodness for them. But here's the rub: if you want to master a topic, not just use it, there's no shortcut. Mastery comes from diving into the muck, dissecting the tiniest details, and building it all back up yourself. It's slow and messy, but it's incredibly satisfying.

My week-long adventure this time? Rebuilding a U-Net from scratch. Let me tell you how it went.


The U-Net Diaries: A Week of Enlightenment

U-Net is a classic deep-learning architecture for image segmentation, and yes, there are a dozen high-quality implementations available online. But instead of copy-pasting a ready-made version, I decided to start from absolute zero. No shortcuts. No black-box magic. Just raw tensor manipulation and a deep dive into every convolutional layer.
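To give a flavour of what "from scratch" means in practice, here's a minimal sketch of the basic U-Net building block in PyTorch: two 3×3 convolutions, each followed by normalization and activation. The name DoubleConv and the exact choices (BatchNorm, ReLU) are mine for illustration; the version in my repo differs in the details.

```python
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions with BatchNorm and ReLU: the basic U-Net block."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

x = torch.randn(1, 3, 128, 128)    # (batch, channels, height, width)
print(DoubleConv(3, 64)(x).shape)  # torch.Size([1, 64, 128, 128])
```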

The first few hours were filled with that nagging thought: Why am I doing this to myself? But then, as I started wiring up the encoder-decoder structure and manually debugging my skip connections, something clicked.

I wasn’t just using U-Net. I was understanding it.
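Part of what clicked was realising that a skip connection is nothing more than a torch.cat along the channel dimension. A toy one-level encoder-decoder makes it concrete (illustrative names, not my repo's exact code):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """One encoder level, one decoder level: enough to show a skip connection."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(3, 16, 3, padding=1)           # encoder features
        self.down = nn.MaxPool2d(2)                         # downsample by 2
        self.bottleneck = nn.Conv2d(16, 32, 3, padding=1)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # upsample back
        self.dec = nn.Conv2d(32, 16, 3, padding=1)          # 32 = 16 (up) + 16 (skip)

    def forward(self, x):
        e = torch.relu(self.enc(x))                     # saved for the skip
        b = torch.relu(self.bottleneck(self.down(e)))
        u = self.up(b)
        # the skip connection: concatenate encoder features along channels
        u = torch.cat([u, e], dim=1)
        return self.dec(u)

print(TinyUNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 16, 64, 64])
```

Debugging these by hand is where you learn that the upsampled tensor and the encoder tensor must agree in spatial size before the concatenation works at all.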



[Figure: The U-Net architecture, with attention layers and skip connections]


Peeling Back the Layers (Literally)

Once I had a working version, I began experimenting with architectures:

  • ResNets as encoders? Yep, that changed how feature maps carried information forward (sketched after this list).

  • Attention mechanisms? Interesting trade-offs in performance and computational cost.

  • Depth vs. efficiency? Tuning the number of downsampling layers made it clear where the bottlenecks were.
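The ResNet experiment boils down to exposing a backbone's intermediate feature maps so the decoder can consume them as skips. Here's a sketch using torchvision's resnet18, under my own naming rather than my repo's exact code:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class ResNetEncoder(nn.Module):
    """Expose intermediate ResNet feature maps for U-Net skip connections."""
    def __init__(self):
        super().__init__()
        r = resnet18(weights=None)  # swap in pretrained weights if desired
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)   # /2 resolution
        self.pool = r.maxpool                                # /4 resolution
        self.stages = nn.ModuleList([r.layer1, r.layer2, r.layer3, r.layer4])

    def forward(self, x):
        feats = []
        x = self.stem(x)
        feats.append(x)          # 64 channels at /2
        x = self.pool(x)
        for stage in self.stages:
            x = stage(x)
            feats.append(x)      # 64, 128, 256, 512 channels, halving resolution
        return feats             # the decoder consumes these as skips

for f in ResNetEncoder()(torch.randn(1, 3, 224, 224)):
    print(f.shape)
```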


By tweaking each component manually, I could see how data flowed through the network, where bottlenecks formed, and how to manipulate tensors effectively. Things that would’ve been abstract concepts in a research paper became painfully obvious when I had to make them work from scratch.
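One trick that makes the data flow visible: forward hooks that print every layer's output shape. This helper is hypothetical, but it's roughly how you can watch tensors move through the network:

```python
import torch
import torch.nn as nn

def trace_shapes(model: nn.Module, x: torch.Tensor):
    """Register forward hooks that print each leaf layer's output shape."""
    hooks = []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            hooks.append(module.register_forward_hook(
                lambda m, inp, out, n=name: print(f"{n:12s} -> {tuple(out.shape)}")
            ))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1),
)
trace_shapes(model, torch.randn(1, 3, 64, 64))
```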


Why This Matters

There’s a reason the best engineers in any field tinker with the fundamentals. Sure, abstraction is great, but when something breaks in production, it’s the low-level understanding that saves the day. Knowing how things are built gives you superpowers when debugging, optimizing, or innovating.

For instance, while implementing my own U-Net, I realized how crucial data preprocessing was. By manually defining transformations, normalizations, and augmentations, I had complete control over how input data affected my model. Instead of blindly trusting a library’s defaults, I could tweak and optimize with intent.
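For example, instead of accepting a dataset class's defaults, you can spell the whole pipeline out. A sketch with torchvision (the ImageNet mean/std here are just a common choice, not a recommendation):

```python
from torchvision import transforms

# Every step is an explicit choice rather than a hidden default.
train_tf = transforms.Compose([
    transforms.Resize((256, 256)),           # fixed input size for the network
    transforms.RandomHorizontalFlip(p=0.5),  # simple augmentation
    transforms.ToTensor(),                   # PIL image -> float tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # e.g. ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
```

One subtlety this makes unmissable for segmentation: geometric augmentations like flips have to be applied identically to the image and its mask, something a default pipeline can quietly get wrong.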


The Repo: A Work in Progress

I’ve dumped all my findings (code, notes, and insights) into a GitHub repo (https://github.com/pardi/ddpm), which I’ll keep updating as I test new architectures and optimizations. If you're curious to see how a U-Net is built from scratch, explore alternative architectures, or even contribute your own ideas, check it out here:

🔗 GitHub Repo: My U-Net from Scratch (https://github.com/pardi/ddpm)

It's a living repository, meaning it will keep evolving with new learnings, tweaks, and better ways to approach segmentation tasks. Feel free to star it, fork it, or just browse through my mistakes. 😆


Final Thoughts

Using pre-built tools is fine. Understanding them is better. If you ever feel like you’re just a hyperparameter tweaker instead of an engineer, take a step back and build something from scratch. It’s humbling, frustrating, and incredibly rewarding.

Now, onto the next challenge—maybe a Transformer from scratch? 🤔

