Translation of Cellular-Resolution OCT Images to H&E-Like Stained Images via Generative Adversarial Network

S.T. Tsai, C.H. Liu, S.L. Huang
National Taiwan University

Keywords: biomedical imaging, image translation, optical coherence tomography


Hematoxylin and eosin (H&E) staining has been essential for visualizing various tissue types and morphologic changes for over a century. It displays a broad range of cytoplasmic, nuclear, and extracellular matrix features, and it is considered the gold standard for skin histopathology, allowing pathologists to assess skin structure accurately. Cellular-resolution optical coherence tomography (CR-OCT) has advanced rapidly in recent years. It gives pathologists non-invasive access to cellular-level images of in vivo human tissue, which could help unveil functions of living organisms and facilitate early clinical diagnosis of disease and cancer.

However, mapping between these two visual domains with high accuracy is challenging. The main challenge is that aligned image pairs for training are difficult to collect or simply unavailable. Another is that the mapping may be inherently one-to-many: a single input may correspond to multiple plausible outputs. A few methods have been developed to address these challenges. CycleGAN, designed for training with unpaired data, uses cycle consistency and bidirectional learning to constrain the output to match the input one-to-one. Virtual staining based on conditional GANs can create H&E-like stained images from ex vivo tissue microscopic photos or autofluorescence images in a few seconds; however, in many other applications it is not easy to collect the thousands of paired images required to train conditional GAN models.

In this study, we propose an image-to-image translation model based on the CycleGAN architecture to convert in vivo tomographic images of human skin into H&E-like stained images. The task is functionally similar to pseudo-coloring or colorization in image processing. A key feature of our model is that it exploits segmentation information during training, which significantly improves the precision of the translation.
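The cycle-consistency constraint that CycleGAN imposes on unpaired data can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two generators `G` (OCT to H&E-like) and `F` (H&E-like to OCT) are hypothetical toy stand-ins (in practice they are convolutional networks trained jointly with adversarial discriminators), and only the L1 cycle loss itself is shown.

```python
import numpy as np

# Hypothetical stand-in generators. In the real model, G and F are CNNs
# trained adversarially; here, exact-inverse toy maps illustrate the idea.
def G(oct_img):            # OCT -> H&E-like domain (toy placeholder)
    return 1.0 - oct_img   # intensity inversion as a stand-in mapping

def F(he_img):             # H&E-like -> OCT domain (toy placeholder)
    return 1.0 - he_img

def cycle_consistency_loss(x, y):
    """L_cyc = E[|F(G(x)) - x|] + E[|G(F(y)) - y|]  (L1 norm).

    Translating to the other domain and back should reconstruct the
    input, which constrains the unpaired mapping to be one-to-one.
    """
    forward = np.mean(np.abs(F(G(x)) - x))    # x -> G(x) -> F(G(x)) ~ x
    backward = np.mean(np.abs(G(F(y)) - y))   # y -> F(y) -> G(F(y)) ~ y
    return forward + backward

# Toy single-channel images with intensities in [0, 1)
x = np.random.rand(64, 64)  # "OCT" sample
y = np.random.rand(64, 64)  # "H&E" sample
print(cycle_consistency_loss(x, y))  # ~0: the toy maps are exact inverses
```

During training, this cycle term is added to the adversarial losses of both directions, so each generator is penalized whenever a round trip fails to reproduce the original image.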
Physicians annotated the lower boundary of the stratum corneum (SC) and the dermal-epidermal junction (DEJ) in both the CR-OCT and the H&E-stained images. The proposed model was trained on two datasets: a CR-OCT dataset and an H&E-stained dataset. Our custom-made CR-OCT system has a near-isotropic 0.9-μm spatial resolution. After depth scanning, a 3D tomogram with a volume of 291.6 × 219.6 × 100 μm³ (648 × 488 × 500 pixels) was collected. With this system, 4,622 in vivo human skin cross-sectional images were collected from five volunteers. In addition, 204 H&E-stained images, each 2,448 × 2,048 pixels, were acquired from 170 slices provided by National Taiwan University Hospital. We used pre-trained segmentation models to introduce biomedical knowledge into the proposed image-to-image translation network, which stains images based on the texture information of the input. Nuclei and the various skin-layer boundaries, critical information for pathologists, can be translated to and visualized in the H&E domain. The model enhances the image specificity of the SC, the DEJ, and nuclei for diagnosing skin tissue disorders, helping clinicians read CR-OCT images effectively. This work represents a critical step towards realizing in vivo cellular-resolution imaging with real-time staining.
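One way the pre-trained segmentation models could guide translation is through a consistency penalty: the layer masks predicted on the input OCT image and on its translated H&E-like counterpart should agree, so the SC and DEJ boundaries survive translation. The sketch below is an assumption about such a term, not the paper's exact loss; the "segmenters" are hypothetical thresholding stand-ins for the pre-trained networks, and a Dice-overlap loss is used for illustration.

```python
import numpy as np

def dice_loss(mask_a, mask_b, eps=1e-7):
    """1 - Dice overlap between two binary layer masks (0 = identical)."""
    inter = np.sum(mask_a * mask_b)
    return 1.0 - (2.0 * inter + eps) / (mask_a.sum() + mask_b.sum() + eps)

def segmentation_guided_loss(oct_img, fake_he, seg_oct, seg_he):
    """Penalize disagreement between the mask predicted on the input OCT
    image and the mask predicted on its translated H&E-like image, so
    that SC/DEJ boundaries are preserved through translation."""
    return dice_loss(seg_oct(oct_img), seg_he(fake_he))

# Hypothetical pre-trained segmenters: simple thresholding stand-ins here.
seg_oct = lambda img: (img > 0.5).astype(float)
seg_he = lambda img: (img > 0.5).astype(float)

oct_img = np.random.rand(64, 64)
fake_he = oct_img.copy()  # a boundary-preserving translation, for the toy case
print(segmentation_guided_loss(oct_img, fake_he, seg_oct, seg_he))  # → 0.0
```

Added on top of the CycleGAN objective, a term of this kind ties the generator to the anatomy the physicians annotated, rather than letting the adversarial loss alone decide where layer boundaries fall.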