Classifying Optical Microscope Images of Exfoliated Graphene Flakes By

www.nature.com/npj2dmaterials ARTICLE OPEN Classifying optical microscope images of exfoliated graphene flakes by data-driven machine learning Satoru Masubuchi1 and Tomoki Machida1 Machine-learning techniques enable recognition of a wide range of images, complementing human intelligence. Since the advent of exfoliated graphene on SiO2/Si substrates, identification of graphene has relied on imaging by optical microscopy. Here, we develop a data-driven clustering analysis method to automatically identify the position, shape, and thickness of graphene flakes from optical microscope images of exfoliated graphene on an SiO2/Si substrate. Application of the extraction algorithm to optical images yielded optical and morphology feature values for the regions surrounded by the flake edges. The feature values formed discrete clusters in the optical feature space, which were derived from 1-, 2-, 3-, and 4-layer graphene. The cluster centers are detected by the unsupervised machine-learning algorithm, enabling highly accurate classification of monolayer, bilayer, and trilayer graphene. The analysis can be applied to a range of substrates with differing SiO2 thicknesses. npj 2D Materials and Applications (2019) 3:4 ; https://doi.org/10.1038/s41699-018-0084-0 INTRODUCTION classification algorithm for exfoliated graphene flakes using Machine learning enables the extraction of meaningful patterns data-driven analysis and machine-learning algorithms. from examples,1 which is one of the traits that characterize human In general, machine learning algorithms are classified into two 1,19 intelligence. Attempts to automate scientific discovery using categories: supervised learning, in which a human operator machine learning have been rapidly increasing in number,2 and annotates a part of the data to train the models, thereby 20 their advantages have been demonstrated in various research constructing a model that classifies the rest of the data, or fields such as biology,3–5 medicine,6,7 and physics.8–12 Across all unsupervised learning, in which patterns in the data such as the domains, the success of machine-learning methods has been most clusters are detected from the unlabeled data. Since the advent of 13–15 18,21 fi prominent in image analysis. In the context of cell biology, monolayer graphene on SiO2/Si substrates, the identi cation automated optical microscopy has been acquiring high-quality process of the graphene flakes has relied on the recognition of 16 cellular images on an unprecedented scale, thus processing graphene on SiO2/Si substrate by the human operator. The fl large-scale image collections and enabling researchers to auto- parameter tuning process for detecting graphene akes on SiO2/Si matically identify cellular phenotypes.3 In medical image proces- substrates was necessary for the automated searching system of 17 sing, deep-learning technologies are opening up new avenues for graphene flakes. By developing a method based on unsuper- the automated diagnostics of various types of diseases.7 vised learning, it can eliminate the identification process, In research fields concerning two-dimensional (2D) materials, ambiguity in the layer thickness determination, and the parameter the recent development of robotic automation has enabled the tuning process. autonomous search for 2D crystals and subsequent assembly of In this report, we present an unsupervised data-driven analysis the 2D crystals into van der Waals superlattices.17 The robotic method to classify graphene layers using a large set of optical 4 assembly of van der Waals heterostructures has suggested the microscope images. By processing ~7 × 10 images of SiO2/Si possibility for high-throughput production of complicated van der substrates with exfoliated graphene, we extracted ~4 × 105 Waals heterostructures. Besides these functionalities, the system segmented regions enclosed by edges. The optical and morpho- provides an opportunity for collecting optical microscope images logical feature values were extracted from the segmented regions. of the exfoliated 2D crystals on SiO2/Si substrates in a large scale. Plotting the data points in the hue (H), saturation (S), and value (V) When graphene flakes are exfoliated onto SiO2/Si substrates using space resulted in discrete clusters. Analyzing the data points the scotch tape method,18 the graphene flakes of various revealed that the clusters are derived from n-layered graphene, = thicknesses and shapes are randomly distributed over the SiO2/ where n 1, 2, 3, and so on. An unsupervised machine-learning Si substrates. If the images of the entire region of SiO2/Si clustering algorithm based on a non-parametric Bayesian mixture substrates with exfoliated graphene are collected, the image model can detect cluster centers, and they can classify graphene dataset must contain a sufficiently large number of n-layered layers with high accuracy (>95%). These results demonstrate that graphene flakes, where n = 1, 2, 3, and so on. Therefore, a large set identification of graphene layers can be performed by computa- 1,19 of SiO2/Si images can provide an unprecedented level of tional methods, thus introducing new routes for utilizing data- information for developing automated segmentation and a driven methods for the identification of graphene layers and determination of the layer thickness. 1Institute of Industrial Science, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Correspondence: Satoru Masubuchi ([email protected]) or Tomoki Machida ([email protected]) Received: 10 September 2018 Accepted: 18 December 2018 Published in partnership with FCT NOVA with the support of E-MRS Satoru Masubuchi and Tomoki Machida 2 a 1. Capture images 2. Extract feature values 3. Clustering analysis Bilateral filter Canny edge detection — ——— Edge closing Feature Y ———— Feature Z Fill up SiO /Si ~50 2 ~ 1×108 features Feature X Regions captured by values Extract feature values optical microscope 1234567890():,; Fig. 1 Data-driven analysis workflow and raw scatter plots of extracted feature values. a Schematic of data analysis workflow. First, the optical microscope images of the SiO2/Si substrate are acquired using the automated optical microscope. The acquired images are fed into the feature extraction pipeline. The segmented region surrounded by the optical edges are applied to the image, generating ~50 feature values from each region (Supplementary Table 1). The extracted feature values are analyzed by the data analysis methods. Since each data point is linked to the optical microscope images, the corresponding images can be investigated in the preceding process. b–d The scatter plots of the feature values: b Sarea vs. IH, c Sarea vs. IS−BG, and d Sarea vs. IV−BG RESULTS color value of the monolayer, bilayer, and trilayer graphene flakes Label-free analysis workflow in advance. By utilizing the extracted regions as the masks to the fi fl fi images, ~50 types of feature values were extracted from each The rst step in the work ow of automated graphene classi ca- 23 tion was to capture optical microscope images of entire exfoliated region (summarized in Supplementary Table 1). The extracted feature values were categorized into three classes: morphology, graphene flakes on the SiO /Si substrate (Fig. 1a). Twenty-four 2 optical intensity, or positions. By applying the data analysis SiO /Si substrates with the size of 1 × 1 cm2 with exfoliated 2 pipeline to the ~7 × 104 images of SiO /Si with exfoliated graphene flakes were tiled onto the silicon chip tray and were 2 graphene layers, ~105 regions were detected and ~107 feature scanned by the automated optical microscope.17 Two types of values were extracted. At least 3–4 SiO2/Si substrates were SiO2/Si substrates with different SiO2 thicknesses of d ~ 290 and 22 required to perform the clustering analysis (Supplementary note 85 nm were used to demonstrate the generality of the image 1). We analyzed the feature values using the data analytics analysis pipeline. The images were acquired using a CMOS camera frameworks Pandas, Matplotlib, Jupyter notebook, and Scikit-learn capable of capturing images in high bit depth (12-bit), which (see Methods). enables accurate detection of the edge patterns. The data were Figure 1b–d show the scatter plots of the area Sarea and the stored in a large-scale storage, comprising a large number of raw optical feature values extracted from the regions in the HSV color optical microscope images of ~1 TB. The images were then space IH, IS−BG, and IV−BG. As the values of optical features, we processed by the feature extraction algorithm (see Methods and utilized the medians of the optical intensities in the HSV color Supplementary Figure 1). The algorithm detects the regions space IH, IS, and IV, respectively. Here, in order to compensate for enclosed by the edges. The third panel of Fig. 1a shows the the unequal illumination of the field, we extracted the medians of representative output of the segmentation algorithm applied to the optical intensities of the bare SiO2/Si substrates without fl the image containing exfoliated graphene akes. The segmented exfoliated graphene IS,BG and IV,BG, and the background values areas were indicated by the colored regions in the third panel of were subtracted for the S and V color spaces as IS−BG = IS − IS,BG Fig. 1a. The process was performed without assuming the specific and IV−BG = IV − IV,BG. The background subtraction was not applied npj 2D Materials and Applications (2019) 4 Published in partnership with FCT NOVA with the support of E-MRS Satoru Masubuchi and Tomoki Machida 3 a A–E in Fig. 2a. These clusters emerged simply on applying the d = 290 nm feature extraction algorithm to the large set of optical microscope images. Therefore, the presented clustering results are expected to be robust against the changes in the experimental parameters. To explore the generality of the image analysis pipeline, we applied the same algorithm to the optical microscope images of SiO2/Si substrates with a thickness d = 85 nm.

Load more