U-Net

U-Net is a <a href="/facts/Convolutional_neural_network/kHEdnmGU">convolutional neural network</a> that was developed for <a href="/facts/Image_segmentation/jofAhbxa">image segmentation</a>. The network is based on a fully <a href="/facts/Convolutional_neural_network/kHEdnmGU">convolutional neural network</a> whose architecture was modified and extended to work with fewer training images and to yield more precise <a href="/facts/Image_segmentation/jofAhbxa">segmentation</a>. With the U-Net architecture, segmenting a 512 × 512 image takes less than a second on a <a href="/facts/Graphics_processing_unit/PTK1RQVp">GPU</a> that was modern in 2015.
The U-Net architecture has also been employed in <a href="/facts/Diffusion_models/ay78DDLn">diffusion models</a> for iterative image denoising. This technology underlies many modern image generation models, such as <a href="/facts/DALL-E/rkWL8P49">DALL-E</a>, <a href="/facts/Midjourney/ETtS6Nhm">Midjourney</a>, and <a href="/facts/Stable_Diffusion/BaXu5Vc8">Stable Diffusion</a>.
U-Net is also being explored for <a href="/facts/Language_models/FntSpg0j">language models</a>. In these models, <a href="/facts/Large_language_model/WnogWVJY">tokenization</a> is not a separate step, which allows the model to understand spelling more easily while concurrently <a href="/facts/Vectorization/PzfAQRv9">vectorizing</a> and tokenizing higher-level concepts.
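
To make the segmentation architecture mentioned above more concrete, the following is a minimal sketch of how the spatial size of a feature map changes as it passes through a U-shaped network. It is not an implementation of the network itself. The layout it assumes (four resolution levels, two 3 × 3 convolutions per level, 2 × 2 max pooling on the way down and 2× upsampling on the way up) follows the commonly cited 2015 design and is not stated in this section; the function name `unet_output_size` and its parameters are hypothetical and chosen for illustration only.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3, padding=0):
    """Trace the side length of a square feature map through a U-shaped network.

    Assumed layout (illustrative, not taken from the text above):
    stride-1 convolutions, 2x2 max pooling going down, 2x upsampling going up.
    """
    # Pixels lost per stride-1 convolution: 2 for an unpadded 3x3 kernel,
    # 0 when padding=1 ("same" convolution).
    shrink = kernel - 1 - 2 * padding
    size = input_size

    # Contracting path: convolutions, then pooling halves the resolution.
    for level in range(depth):
        size -= convs_per_level * shrink
        if size <= 0 or size % 2 != 0:
            raise ValueError(
                f"size {size} at level {level} cannot be pooled evenly"
            )
        size //= 2

    # Convolutions at the lowest resolution (bottom of the "U").
    size -= convs_per_level * shrink

    # Expanding path: upsampling doubles the resolution, then convolutions.
    for _ in range(depth):
        size *= 2
        size -= convs_per_level * shrink

    if size <= 0:
        raise ValueError("input is too small for this depth")
    return size


if __name__ == "__main__":
    print(unet_output_size(572))             # unpadded convolutions
    print(unet_output_size(512, padding=1))  # padded convolutions keep the size
```

Under these assumptions, unpadded convolutions produce an output map smaller than the input, while padded convolutions return a map of the same size as the input, which is the more common choice when a full-resolution 512 × 512 segmentation is wanted.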
