SimuScan and Large-Area AFM: Toward Autonomous Nanoscale Discovery through Synthetic Data and Machine Learning

R. Millan‐Solsona, M. Checa, L. Collins
Oak Ridge National Laboratory,
United States

Keywords: AFM, large-area AFM, SimuScan, synthetic data, machine learning

Summary:

Atomic Force Microscopy (AFM) is a fundamental technique for exploring biological and material systems at the nanoscale, from individual biomolecules to complex biofilms. However, its potential for discovery is limited by the need for expert operation and by the small area covered in conventional acquisitions. The Large-Area AFM framework [1] established a scalable approach for automated mosaicking and quantitative analysis of extended biological landscapes, enabling the capture of hundreds of stitched images with submicron precision. Building on this foundation, SimuScan [2] advances toward full autonomy by combining synthetic data generation, deep-learning-based segmentation, and adaptive AFM control. SimuScan produces realistic AFM datasets of DNA origami structures, bacterial morphologies, and heterogeneous biofilms, incorporating typical experimental artifacts such as tip convolution, adhesion zones, noise, and distortions introduced by AFM image post-processing. Because the synthetic datasets are labeled automatically, they can be used to train models such as YOLOv8, U-Net, and Mask R-CNN that identify and quantify nanoscale features in acquired topographic images in real time, supporting in situ decisions about which objects to examine next. Integrated with adaptive scanning, the trained vision system can autonomously explore large-area mosaics, identify regions of interest, and re-scan them at higher resolution, linking the capabilities of Large-Area AFM with the intelligence of SimuScan. Together, these approaches turn AFM into a data-driven autonomous platform for biological discovery that connects imaging, machine learning, and synthetic data to accelerate nanoscale insight.
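To illustrate the idea of automatically labeled synthetic data, the following is a minimal sketch, not the SimuScan implementation. It assumes a flat substrate decorated with dome-shaped objects, a parabolic tip, and Gaussian noise; the function names (make_surface, dilate_with_tip, add_noise, make_sample) and all parameter values are hypothetical. Tip convolution is modeled as a grayscale dilation, i(x) = max_u [s(x+u) - t(u)], with t(u) = |u|^2 / (2R) for a tip of apex radius R. Adhesion zones and post-processing distortions are not modeled here.

```python
import numpy as np

def make_surface(shape=(64, 64), n_objects=5, radius_px=4.0, height_nm=2.0, seed=0):
    """Flat substrate with dome-shaped objects; returns (height map, label mask)."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    surface = np.zeros(shape)
    labels = np.zeros(shape, dtype=np.int32)  # 0 = background, k = object k
    for k in range(1, n_objects + 1):
        cy = rng.uniform(radius_px, shape[0] - radius_px)
        cx = rng.uniform(radius_px, shape[1] - radius_px)
        r2 = (yy - cy) ** 2 + (xx - cx) ** 2
        dome = height_nm * np.sqrt(np.clip(1.0 - r2 / radius_px**2, 0.0, None))
        surface = np.maximum(surface, dome)
        labels[r2 <= radius_px**2] = k  # ground truth comes for free
    return surface, labels

def dilate_with_tip(surface, tip_radius_nm=3.0, pixel_nm=1.0, half_width=6):
    """Grayscale dilation with a parabolic tip (assumed tip model)."""
    h, w = surface.shape
    padded = np.pad(surface, half_width, mode="edge")
    image = np.full(surface.shape, -np.inf)
    for dy in range(-half_width, half_width + 1):
        for dx in range(-half_width, half_width + 1):
            # Height of the tip surface at lateral offset (dy, dx) from its apex.
            t = (dy**2 + dx**2) * pixel_nm**2 / (2.0 * tip_radius_nm)
            shifted = padded[half_width + dy: half_width + dy + h,
                             half_width + dx: half_width + dx + w]
            image = np.maximum(image, shifted - t)
    return image

def add_noise(image, sigma_nm=0.1, seed=0):
    """Additive Gaussian height noise (a simplification of real AFM noise)."""
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma_nm, size=image.shape)

def make_sample(seed=0):
    """One synthetic training pair: simulated image and its label mask."""
    surface, labels = make_surface(seed=seed)
    image = add_noise(dilate_with_tip(surface), seed=seed + 1)
    return image, labels
```

Each call to make_sample(seed) yields an image and a matching mask that could, in principle, be fed to a segmentation model. Note that the mask marks the true object footprint, while the dilated image shows the objects broadened by the tip.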
By merging automation, artificial intelligence, and synthetic realism, this combined framework democratizes access to advanced AFM workflows, reduces operator bias, and opens the path to rapid phenotyping, biomaterial evaluation, and the dynamic exploration of living interfaces across unprecedented scales.

References:

[1] Millan-Solsona, R. et al. Analysis of biofilm assembly by large area automated AFM. npj Biofilms and Microbiomes 11, 75 (2025).
[2] Millan-Solsona, R. et al. SimuScan: Label-Free Deep Learning for Autonomous AFM. Research Square 1, doi:10.21203/rs.3.rs-7724735/v1 (2025).