Issue 
Natl Sci Open
Volume 3, Number 2, 2024
Special Topic: AI for Chemistry



Article Number  20230055  
Number of page(s)  15  
Section  Chemistry  
DOI  https://doi.org/10.1360/nso/20230055  
Published online  01 February 2024 
RESEARCH ARTICLE
A machine-learning-enabled approach for bridging multiscale simulations of CNTs/PDMS composites
^{1} Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
^{2} School of Textile Science and Engineering, Xi’an Polytechnic University, Xi’an 710048, China
^{3} State Key Laboratory of Intelligent Textile Material and Products, Xi’an Polytechnic University, Xi’an 710048, China
^{4} Materials Genome Institute of Shanghai University, Shanghai University, Shanghai 201900, China
^{*} Corresponding author (email: wangxiaonan@tsinghua.edu.cn)
Received: 6 September 2023
Revised: 7 January 2024
Accepted: 8 January 2024
Benefitting from the interlaced networking structure of carbon nanotubes (CNTs), the composites of CNTs/polydimethylsiloxane (PDMS) have found extensive applications in wearable electronics. While hierarchical multiscale simulation frameworks exist to optimize the structure parameters, their wide application has been hindered by the high computational cost. In this study, a machine learning model based on an artificial neural network (ANN) embedded graph attention network, termed AGAT, is proposed. The datasets collected from the microscale and the macroscale simulations are utilized to train the model. The ANN layer within the model framework is trained to pass the information from the microscale to the macroscale, while the whole model is aimed at predicting the electromechanical behavior of the CNTs/PDMS composites. By comparing the AGAT model with the original multiscale simulation results, the data-driven strategy is shown to be promising with high accuracy, demonstrating the potential of the machine-learning-enabled approach for the structure optimization of CNT-based composites.
Key words: multiscale simulation / machine learning / material property prediction / CNTs/PDMS composites
© The Author(s) 2024. Published by Science Press and EDP Sciences.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
INTRODUCTION
In contrast to rigid devices, flexible electronic devices exhibit excellent adaptability on unconventional interfaces, particularly the surface of human skin, and thus can be widely used in various scenarios, such as human motion recognition, health monitoring, and human-machine interfaces [1,2]. Among them, the remarkable electrical and mechanical properties of carbon nanotubes (CNTs), together with the outstanding stretchability and flexibility of polydimethylsiloxane (PDMS), make the CNTs/PDMS composite a suitable strain-sensing material for numerous sensing applications [3–5].
The conductivity of the CNTs/PDMS composite benefits from the interlaced networking structure of CNTs, while the response of the CNTs/PDMS sensor primarily arises from changes in its electrically conductive network under material deformation caused by external forces [6]. Hence, the microstructure, including the geometrical morphology of CNTs, the volume ratio, and the distribution of CNTs within the matrix, significantly influences the properties of the CNTs/PDMS composite. Therefore, the structure-property relations in CNTs/PDMS composites have been widely investigated [7]. Among the available approaches, numerical simulation methods have gained popularity as they offer an efficient alternative to time-consuming experimental research [8]. Arora and Pathak [9] proposed an efficient computational methodology to predict effective orthotropic elastic properties of CNT-polymer composites at diverse constituent conditions. In that work, the Mori-Tanaka homogenization scheme was implemented with a finite element method (FEM) approach to predict the material properties of nanocomposites. Zhang et al. [10,11] studied the mechanical properties of 3D braided composites using FEM analysis. Li et al. [12] designed simulation models by changing the content, aspect ratio, and orientation degree of CNTs to investigate the electrical conductivity of CNTs/PE composites with different mesostructures.
It is worth mentioning that the properties of a single CNT, which provide the essential parameters of the FEM model, can be obtained from microscale calculations. Numerous studies have explored how the mechanical and electrical properties of CNTs are influenced by their geometric morphology. Ebbesen et al. [13] argued that the electronic properties of CNTs were strongly modulated by small structural variations, and measured the electrical properties of different nanotubes with diverse lengths and radii. Bao et al. [14] simulated the Young’s moduli of CNTs based on molecular dynamics (MD) simulation, while Wagner et al. [15] investigated the piezoresistive effect of CNTs within density functional theory (DFT). Wei et al. [16] presented a method for measuring the natural frequencies (f) as a function of the length (L) of individual CNTs, and the axial Young’s moduli and radial shear moduli of the CNTs were obtained simultaneously by fitting the experimental f-L data using the Timoshenko beam model. Peralta-Inga et al. [17] proposed a density functional tight-binding self-consistent charge approach to study the elastic properties of nine CNTs of different helicities and diameters.
The studies reviewed above indicate that improving the performance of CNTs/PDMS composites requires insights spanning the microscale to the macroscale. In recent decades, the concept of multiscale modelling has emerged to describe procedures that seek to simulate continuum-scale behavior using information gleaned from computational models of finer scales in the system, rather than resorting to empirical constitutive models [18,19]. Multiscale approaches are particularly attractive for CNTs-based composite structures due to the atomic-scale dependencies of a single CNT [4]. Multiscale methods have been mainly categorized into two classes: hierarchical and concurrent multiscale methods [20]. In hierarchical approaches, the molecular and macro models are simulated sequentially. To be specific, in the CNTs/polymer system, the molecular model can be utilized to calculate effective material properties, which are then passed to the FEM model to simulate the material properties on a macroscale. Subramanian et al. [21] proposed an atomistically informed stochastic multiscale model to predict the behavior of CNT-enhanced nanocomposites. In that work, MD simulations were performed to study sub-nanoscale interactions of the CNT with the polymeric phase of the nanocomposite. Jiang et al. [22] developed a predictive constitutive model using a hierarchical multiscale approach based on molecular dynamics and the generalized interpolation material point method. The MD simulations were used to construct an elasto-damage model, which was subsequently incorporated into material point methods for large-scale simulations.
However, the employment of molecular models within the FEM presents a significant computational burden, and designing the microstructure required to achieve the desired macroscopic set of properties is often intractable, due to the multiple optimization objectives, the high-dimensional and multiscale optimization space, the presence of nonlinear, stochastic, and multiphysics interactions, and the lack of governing equations for the macroscale behavior [23]. Fortunately, machine learning (ML) is an emerging field that provides high-dimensional, data-driven modeling that can map nonlinear correlations and therefore can be tailored to material discovery and optimization. ML uses a range of statistical and probabilistic approaches, allowing machine intelligence to learn from experience and to identify the hidden patterns (input-output correlations) in large and often noisy datasets; such approaches are now seen as successful for the design and discovery of new materials for a wide variety of applications [24]. Recently, ML has emerged as a powerful technique in the field of composite materials [25–27]. Yuan et al. [28] proposed an axial elastic modulus degradation prediction method for [0_{m}/90_{n}]_{s} cross-ply laminates using an ML model. The dataset of the ML model was established based on published experimental data and a small number of FEM results. Liu et al. [29] proposed a hybrid ML method to predict the macroscopic thermal conductivity of CNTs-reinforced polymeric nanocomposites. Huang et al. [30] proposed a predictive model assisted by ML techniques, including artificial neural network (ANN) and support vector machine (SVM), to map the relationship between the mechanical properties of CNTs-reinforced cement composites and multiple influential factors. Le [31] developed a quick and robust computational tool based on the ML Gaussian process regression model to predict the tensile strength of CNT/polymer nanocomposites.
However, the aforementioned work mainly focused on utilizing ML-guided design approaches for predicting properties of composite materials on the macroscale. Recently, several studies have been published using ML in multiscale modeling and simulation [20]. Xiao et al. [20,32] proposed an ML-enhanced hierarchical multiscale approach based on the dataset generated from both the MD simulations and the continuum model to study the mechanical behaviors of materials at the macroscale. Matouš et al. [33] noted in a review that ML methods that seek meaningful low-dimensional structures hidden in high-dimensional multiscale data (both computational and experimental) will be important for a variety of tasks. Meanwhile, Fish et al. [18] also claimed that data-driven and ML tools have great potential to accelerate materials discovery when combined with physics-based multiscale methods. To support this view, they cited a recent study that utilized Gaussian process metamodels informed by systematically coarse-grained MD simulations to discover optimal mechanical properties of polymer-grafted nanoparticle assemblies. Once trained and validated, such surrogate models can rapidly generate new data points by interpolating simulated outcomes, while sensitivity analyses easily reveal the parameters that matter the most [34].
Based on the above literature, this study developed an ML-enabled model that offers a multiscale approach for predicting the sensing behavior of composite sensing materials, as illustrated in Figure 1. Firstly, a multiscale electromechanical framework was proposed for modeling the electrical resistance change of the CNTs/PDMS composite under material deformation. This framework covers both the micro- and macroscales for both the mechanical and electrical domains involved. Within this framework, the DFT calculations were initially conducted to determine the Young’s modulus of an individual CNT. The result was then passed to the FEM simulation for predicting the macroscale sensing behaviors of the mesostructures of CNTs/PDMS composites. Subsequently, an ML model, named AGAT, was established based on the architecture of ANN-embedded graph attention networks (GAT). The embedded ANN layer, utilized for predicting the Young’s modulus from the length and radius of the CNT, acted as a bridge between the DFT calculation and the FEM simulation, and is referred to as the DFT-FEM (DF) module. The dataset collected from multiscale simulations and the literature was utilized to train AGAT. The well-trained ML model was finally used to predict the material sensing property, with the mesostructure of the CNTs/PDMS composite serving as the input parameters.
Figure 1 The diagram of the ML-enabled multiscale approach for predicting the sensing behavior of CNTs/PDMS composites.
METHODS
The microscale calculation
The purpose of the microscale calculation is to generate data samples that relate the Young’s modulus to the length and radius of a single CNT. The DFT calculations were performed using the VASP code. The Perdew-Burke-Ernzerhof functional within the generalized gradient approximation was used to treat the exchange-correlation interaction, while the projector augmented-wave pseudopotential was applied with a kinetic energy cutoff of 500 eV for the expansion of the electronic eigenfunctions. The vacuum thickness was set to 25 Å to minimize interlayer interactions. The Brillouin-zone integration was sampled by a Γ-centered 5 × 5 × 1 Monkhorst-Pack k-point mesh. All atomic positions were fully relaxed until the energy and force reached tolerances of 1 × 10^{−5} eV and 0.03 eV/Å, respectively. The dispersion-corrected DFT-D method was employed to account for the long-range interaction.
The macroscale simulation
In the macroscale simulation, the FEM was employed to generate the dataset by simulating the mechanical-electric response of CNTs/PDMS composites with diverse mesostructures. The simulation started by constructing the geometric structure of the CNTs/PDMS composite based on four changeable parameters, namely the CNT radius, the CNT length, the volume ratio, and the CNT quantity. Establishing a comprehensive macroscale geometry model for the composite is challenging, as the complex microstructure leads to a significant computational burden. Hence, to enhance computational efficiency, representative volume elements (RVEs) with dimensions of 5 μm × 5 μm × 5 μm were utilized to establish the geometry model for property evaluation. The RVE generation algorithm was developed in Python 3.6 with the ABAQUS software. The CNT filler was treated as a cylinder and the PDMS was regarded as the matrix in the RVE model. The proposed generation algorithm, which operates within the boundaries of the RVE matrix, was used to build the fillers based on the given input parameters. Owing to the van der Waals forces, CNTs within the matrix do not physically overlap or interlace. Thus, an avoidance algorithm was devised to mimic this real-world structural characteristic. The flowchart of the RVE generation algorithm is depicted in Figure 2. Take the t-th CNT as an example. Firstly, the seed point of the CNT is created randomly within the matrix boundary, denoted as ${A}_{0}^{t}$ = (x_{0}, y_{0}, z_{0}); then the axis of the CNT extends from the starting point ${A}_{0}^{t}$ to the given length; each incremental growth step is recorded, resulting in the axis of the CNT being expressed as the set Path^{t} = {${A}_{0}^{t}$, ${A}_{1}^{t}$, …, ${A}_{n}^{t}$}, where n is determined by the length of the CNT.
Whenever a new CNT is created, the avoidance algorithm is triggered. The overlap between two CNTs can be calculated by Eq. (1)
Figure 2 The flowchart of the RVE generation algorithm. 
$\begin{array}{c}\text{Overlap}(Pat{h}^{q},Pat{h}^{t})=\mathrm{min}(\left|{A}_{i}^{t}-{A}_{j}^{q}\right|,{A}_{j}^{q}\in Pat{h}^{q},{A}_{i}^{t}\in Pat{h}^{t}),\end{array}$(1)
where Path^{q} refers to the axis set of the q-th CNT. Since Overlap is the minimum distance between the two axes, the t-th CNT and the q-th CNT are considered overlapping when the value of Overlap falls below twice the given radius. Upon iterating through all existing CNTs, any overlap with the newly created t-th CNT leads to its removal. The iterative creation of CNTs continues until the quantity reaches the set value, at which point the RVE model is established.
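The generation and avoidance steps can be sketched as follows. This is a minimal illustration rather than the ABAQUS script used in this work: it assumes straight CNT axes, a uniform growth step, and a simple rejection loop, and it treats two CNTs as overlapping when the minimum axis distance of Eq. (1) is below twice the radius. The function names (`grow_axis`, `overlap`, `generate_rve`) and the parameters `step` and `max_tries` are hypothetical.

```python
import math
import random


def grow_axis(seed, direction, length, step):
    """Return the axis points A_0..A_n of a straight CNT (assumed shape)."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    n = int(round(length / step))
    return [tuple(s + k * step * u for s, u in zip(seed, unit))
            for k in range(n + 1)]


def overlap(path_q, path_t):
    """Eq. (1): minimum distance between the axis points of two CNTs."""
    return min(math.dist(a, b) for a in path_t for b in path_q)


def generate_rve(quantity, radius, length, box=5.0, step=0.05,
                 max_tries=10000, rng=None):
    """Place CNT axes in a cubic RVE (lengths in micrometres)."""
    rng = rng or random.Random(0)
    paths = []
    tries = 0
    while len(paths) < quantity and tries < max_tries:
        tries += 1
        seed = tuple(rng.uniform(0.0, box) for _ in range(3))
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        path = grow_axis(seed, direction, length, step)
        # Discard candidates that leave the matrix boundary.
        if any(not (0.0 <= c <= box) for point in path for c in point):
            continue
        # Avoidance step: discard candidates closer than 2r to any CNT.
        if any(overlap(q, path) < 2.0 * radius for q in paths):
            continue
        paths.append(path)
    return paths


if __name__ == "__main__":
    rve = generate_rve(quantity=10, radius=0.05, length=1.0)
    print(len(rve), "CNT axes placed")
```

Because Eq. (1) compares sampled axis points only, the growth step should be small relative to the CNT diameter; otherwise two crossing axes could pass between sample points undetected.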
To explore the impact of the generated mesostructure configurations on the sensing property of CNTs/PDMS composites, a finite element analysis was undertaken. This analysis aimed to calculate the electrical resistance at strain levels of 0%, 4%, 8%, 12%, 16%, and 20% for a range of diverse RVE models. To take advantage of COMSOL in multiphysics calculation, the pre-established RVE model was imported into the COMSOL Multiphysics software for the mechanical-electrical simulation. Notably, a tunneling effect would be triggered when the distance between two individual CNTs is less than 1 nm, as depicted in Figure 3A. The simulation model was operated under the coupled structural mechanics and AC/DC module interfaces. Within the structural mechanics module, two matrix interfaces along the z-axis, namely the upper interface and the bottom interface, were chosen and assigned a fixed constraint condition and a prescribed displacement condition, respectively. The strain of the material was prescribed along the z-axis direction. As illustrated in Figure 3B, in the AC/DC module, the upper surface was designated as the voltage entry point, which was set at 1 V, while the bottom surface was selected as the voltage outlet, configured with the ground boundary condition. In terms of meshing, the substantial number of CNT fillers results in a sharp increase in the degrees of freedom of the RVE model’s grid, subsequently reducing the model’s computational efficiency. Hence, to enhance the computational efficiency, coarser grids were selected within the CNTs domain. The material parameters are shown in Table 1.
Figure 3 The FEM simulation of the CNTs/PDMS composite. (A) Tunnel effect; (B) boundary conditions; (C) distributions of electrical potential under different strains. 
Table 1 Material parameters of the CNTs/PDMS composite
The resistance R of the CNTs/PDMS composite is calculated according to Eqs. (2) and (3).
$\begin{array}{c}I={\displaystyle \iint_{S}\boldsymbol{J}\cdot \boldsymbol{n}\,\text{d}s}={\displaystyle \iint_{S}\sigma \nabla \phi \cdot \boldsymbol{n}\,\text{d}s}={\displaystyle \sum _{i=1}^{{N}_{s}}{\sigma}_{i}\frac{\partial \phi}{\partial n}{S}_{i}},\end{array}$(2)
$\begin{array}{c}R=\frac{U}{I},\end{array}$(3)
where N_{s} is the number of mesh elements on the voltage-applied interface, which is the upper face in this work, J is the current density, n is the unit normal of the interface, σ_{i} and S_{i} are the electrical conductivity and the area of the i-th mesh element, respectively, φ is the electrical potential, U refers to the voltage value, and I denotes the current value. The electrical potential distributions of the CNTs/PDMS composite under different strains are shown in Figure 3C.
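As an illustration of Eqs. (2) and (3), the discrete sum over the mesh elements of the voltage-applied face can be evaluated as in the sketch below. The per-element conductivities, normal potential gradients, and areas would come from the FEM solution; the function name `resistance` and the numerical values in the example are hypothetical and serve only as a sanity check against the textbook result R = L/(σA) for a uniform conductor.

```python
import math


def resistance(voltage, sigma, dphi_dn, area):
    """Eqs. (2)-(3): R = U / I with I = sum_i sigma_i * (dphi/dn)_i * S_i."""
    current = sum(s * g * a for s, g, a in zip(sigma, dphi_dn, area))
    return voltage / current


if __name__ == "__main__":
    # Hypothetical uniform block: conductivity 2 S/m, length 5 m,
    # cross-section 4 m^2 split into four 1 m^2 elements, 1 V applied.
    U, length = 1.0, 5.0
    sigma = [2.0] * 4
    gradient = [U / length] * 4   # uniform potential gradient
    area = [1.0] * 4
    R = resistance(U, sigma, gradient, area)
    # For a uniform conductor this should match L / (sigma * A).
    print(R, math.isclose(R, length / (2.0 * 4.0)))
```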
Machine learning
The architecture of AGAT is displayed in Figure 4. The overall architecture comprises the DF module and the GAT module. The embedded DF module is formed using an ANN to predict the Young’s modulus of an individual CNT and to expand the 4-node input to a 5-node input. Subsequently, the expanded input undergoes five layers of graph attention. In each attention layer, an 8-head attention mechanism is employed to update the information associated with each vertex by aggregating information from adjacent vertices with specific weights. The resulting outputs then pass through another graph attention layer with single-head attention to obtain the final predictive values.
Figure 4 The architecture of AGAT. (A) The DF module constructed using an ANN for predicting the Young’s modulus of an individual CNT; (B) integration of the value generated by the DF module with the original input; (C) configuration of the GAT with initial five layers of GAT convolutions, each featuring an 8-head attention mechanism, and a final layer with a single-head attention mechanism.
As shown in Figure 4A, the embedded DF module, representing the microscale CNT property, is built with the ANN. In AGAT, the CNTs/PDMS composite sample can be represented by an input H = (h_{0}, h_{1}, …, h_{n−1}) (as shown in Figure 4B). Two vertices of H, say (h_{2}, h_{3}), were initially sent to the DF module for predicting the Young’s modulus of a single CNT, denoted by h_{n}. The DF module is composed of an input layer, five hidden layers, and an output layer, each of which is denoted by Z^{i}. The input layer consists of 2 nodes holding the normalized values, the five hidden layers consist of 8, 32, 64, 64, and 32 nodes, respectively, and the output layer has 1 node holding the predicted value. The process of transmitting information among nodes is presented in Eq. (4).
$\begin{array}{c}{Z}^{i}={f}^{i}({w}^{i}{Z}^{i-1}+{b}^{i}),\end{array}$(4)
where Z^{i} denotes the node values of the i-th layer, and w^{i} and b^{i} represent the weight matrix and bias of that layer, respectively. f^{i} refers to the activation function, for which the ReLU function is used in this work. Since this module addresses a regression task, no activation function is employed on the output layer.
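The forward pass of Eq. (4) with the stated layer widths can be sketched in plain Python as follows. The weights here are random and untrained, so the output is not a physical Young’s modulus; the initialization range, the function names, and the choice of which two entries of H feed the module (taken as h_{2} and h_{3}, following the text) are assumptions made for illustration.

```python
import random

# Layer widths stated in the text: 2 inputs, five hidden layers, 1 output.
LAYER_SIZES = [2, 8, 32, 64, 64, 32, 1]


def init_params(sizes, rng):
    """Random (untrained) weights and zero biases for each layer."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
             for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params


def relu(values):
    return [max(0.0, v) for v in values]


def forward(params, x):
    """Eq. (4): Z^i = f^i(w^i Z^{i-1} + b^i); the output layer is linear."""
    z = list(x)
    for i, (w, b) in enumerate(params):
        z = [sum(wij * zj for wij, zj in zip(row, z)) + bi
             for row, bi in zip(w, b)]
        if i < len(params) - 1:   # no activation on the output layer
            z = relu(z)
    return z[0]


def expand_input(params, h):
    """Append the DF-module output h_n to the 4-node input H."""
    return list(h) + [forward(params, [h[2], h[3]])]


if __name__ == "__main__":
    params = init_params(LAYER_SIZES, random.Random(0))
    print(expand_input(params, [0.1, 0.2, 0.3, 0.4]))
```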
The output h_{n} was subsequently integrated with the original input H to yield the updated sequence ${H}^{\prime}$= (h_{0}, h_{1}, … , h_{n}). The core framework employed in AGAT is the GAT, which leverages the attention mechanism to compute weights between each vertex and its neighbors during the message passing phase. As illustrated in Figure 4C, within this phase, the information associated with each vertex is aggregated with that of its adjacent vertices and connected edges through an attention strategy, resulting in updated information. This iterative process continues until a definitive representation for each vertex is achieved. In the message passing approach, the update of the vertex ${h}_{i}{}^{t}$ is presented in Eq. (5).
$\begin{array}{c}{h}_{i}{}^{t+1}=U({h}^{t}{}_{i},\{{h}_{j},{e}_{ji}\}),\end{array}$(5)
where U is the update function, ${h}_{i}{}^{t+1}$ is the updated information of the vertex, h_{j} refers to the neighboring vertices, and e_{ji} represents the edges connecting the neighbors to the vertex h_{i}.
In the framework of AGAT, the update function was derived based on the multi-head attention mechanism. Firstly, the importance of a neighboring vertex h_{j} was learnt and the attention score was calculated according to Eq. (6).
$\begin{array}{c}{\partial}_{ij}={\text{softmax}}_{j}(a(w{h}_{j},w{h}_{i}))=\frac{\text{exp}(a(w{h}_{j},w{h}_{i}))}{{\displaystyle \sum _{j\in {N}_{i}}\text{exp}(a(w{h}_{j},w{h}_{i}))}},\end{array}$(6)
where w is the trainable weight, a denotes the self-attention calculation, h_{i} represents the node information in the graph, h_{j} refers to the information of a neighboring node of h_{i}, and N_{i} represents the set of neighbors of node i. The softmax function is used here to normalize the attention scores. The AGAT employs the multi-head attention mechanism, and thus the aggregated information of node i, denoted as ${{h}^{\prime}}_{i}$, is derived in Eq. (7).
$\begin{array}{c}{{h}^{\prime}}_{i}=\Vert {}_{k=1}^{K}\sigma \left({\displaystyle \sum _{j\in {N}_{i}}{\partial}_{ij}^{k}{w}^{k}{h}_{j}}\right),\end{array}$(7)
where $\sigma $ refers to the activation function, which is LeakyReLU in this work, $\Vert $ denotes concatenation, and K denotes the number of independent attention heads, indexed by k. Five hidden layers of GAT convolutions were utilized in the AGAT, each consisting of an 8-head attention mechanism. After all vertices have been aggregated, the updated sequence information, denoted as ${H}^{\prime\prime}=({{h}^{\prime}}_{0},{{h}^{\prime}}_{1},\dots ,{{h}^{\prime}}_{n})$, was sent into the final layer of AGAT. In this layer, a single-head attention mechanism was employed to calculate the predictive values.
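A single multi-head attention layer in the spirit of Eqs. (6) and (7) can be sketched as below. The text does not specify the form of the scoring function a; the sketch assumes the common GAT choice of a LeakyReLU applied to a learned vector dotted with the concatenated pair (w h_{j}, w h_{i}), with an assumed negative slope of 0.2. All names (`attention_head`, `multi_head`, `a_vec`) are hypothetical, and the weights in the example are hand-picked rather than trained.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(w, h):
    return [sum(a * b for a, b in zip(row, h)) for row in w]


def attention_head(h, neighbors, w, a_vec):
    """One head: softmax-normalised scores (Eq. 6), weighted sum (Eq. 7)."""
    wh = [matvec(w, hi) for hi in h]
    out = []
    for i, nbrs in enumerate(neighbors):
        # Assumed scoring function: LeakyReLU(a_vec . [w h_j || w h_i]).
        scores = [leaky_relu(sum(a * v for a, v in zip(a_vec, wh[j] + wh[i])))
                  for j in nbrs]
        top = max(scores)                       # for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        agg = [sum(wt * wh[j][d] for wt, j in zip(weights, nbrs))
               for d in range(len(wh[i]))]
        out.append([leaky_relu(v) for v in agg])
    return out


def multi_head(h, neighbors, heads):
    """Concatenate the outputs of K heads for every vertex (Eq. 7)."""
    per_head = [attention_head(h, neighbors, w, a) for w, a in heads]
    return [sum((head[i] for head in per_head), []) for i in range(len(h))]


if __name__ == "__main__":
    h = [[1.0], [3.0]]                 # two vertices with scalar features
    neighbors = [[0, 1], [0, 1]]       # fully connected, self-loops included
    heads = [([[1.0]], [0.0, 0.0])] * 2
    print(multi_head(h, neighbors, heads))
```

With a zero scoring vector the attention weights are uniform, so each vertex receives the mean of its neighbours’ transformed features, which gives a convenient check of the implementation.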
RESULTS
Data collection, metrics, and cross-validation
To verify the effectiveness of the proposed model, in this section, the AGAT model was trained on the collected dataset. As shown in Figure 5A, the ML model provides an inexpensive relationship between the input parameters, denoted as X_{p} = [x_{p}^{1}, x_{p}^{2}, x_{p}^{3}, x_{p}^{4}] and representing the structure (specifically the CNT radius, the CNT length, the volume ratio, and the CNT quantity) of the p-th sample, and the corresponding resistance response, a 6-dimensional vector R_{p} = [r_{p}^{1}, r_{p}^{2}, r_{p}^{3}, r_{p}^{4}, r_{p}^{5}, r_{p}^{6}] consisting of the electrical resistance values at 0%, 4%, 8%, 12%, 16%, and 20% strains, respectively. In the data collection process, 63 samples from the open literature [14,15,17,35,36] and from the generated DFT simulations (see Table S1) were gathered to train the DF module in the AGAT model, which serves to predict the Young’s modulus of a single CNT from its length and radius. Afterwards, the Young’s modulus was passed to the COMSOL software before the FEM simulation. In this way, 230 numerically generated resistance responses were obtained as the outputs to train the remaining parameters of the AGAT model (see Table S2).
Figure 5 (A) The structure parameter vector X_{p} = [x_{p}^{1}, x_{p}^{2}, x_{p}^{3}, x_{p}^{4}] is fed to the AGAT model to predict the corresponding electrical resistance vector R_{p} = [r_{p}^{1}, r_{p}^{2}, r_{p}^{3}, r_{p}^{4}, r_{p}^{5}, r_{p}^{6}]; (B) the predictive performance of the AGAT model. 
To assess the accuracy of the proposed AGAT model, several metrics, including the coefficient of determination (R^{2}), the mean absolute error (MAE), and the mean squared error (MSE), were adopted to evaluate the performance of the trained model. They are calculated according to Eqs. (8)‒(10).
$\begin{array}{c}{R}^{2}=1-{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{k=1}^{6}{({r}_{i}^{k}-{\widehat{r}}_{i}^{k})}^{2}}}/{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{k=1}^{6}{({r}_{i}^{k}-{\overline{r}}_{i}^{k})}^{2}}},\end{array}$(8)
$\begin{array}{c}\text{MSE}=\frac{1}{6N}{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{k=1}^{6}{({r}_{i}^{k}-{\widehat{r}}_{i}^{k})}^{2}}},\end{array}$(9)
$\begin{array}{c}\text{MAE}=\frac{1}{6N}{\displaystyle \sum _{i=1}^{N}{\displaystyle \sum _{k=1}^{6}\left|{r}_{i}^{k}-{\widehat{r}}_{i}^{k}\right|}},\end{array}$(10)
where N represents the number of samples, r_{i} indicates the target output, ${\widehat{r}}_{i}$ denotes the predicted output, and $\overline{r}$ refers to the average value of the target outputs.
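For reference, Eqs. (8)‒(10) can be computed as in the following sketch. One point of interpretation is assumed here: the average $\overline{r}$ is taken as the global mean over all target values; a per-output mean would be an equally plausible reading. The function name `metrics` is hypothetical.

```python
def metrics(targets, preds):
    """R^2, MSE and MAE over N samples with several outputs each.

    `targets` and `preds` are lists of equally sized rows (six resistance
    values per sample in this work). The mean used in R^2 is the global
    mean of all target values (an assumption).
    """
    flat_t = [v for row in targets for v in row]
    flat_p = [v for row in preds for v in row]
    n = len(flat_t)
    mean_t = sum(flat_t) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(flat_t, flat_p))
    ss_tot = sum((t - mean_t) ** 2 for t in flat_t)
    abs_err = sum(abs(t - p) for t, p in zip(flat_t, flat_p))
    return {"r2": 1.0 - ss_res / ss_tot, "mse": ss_res / n, "mae": abs_err / n}


if __name__ == "__main__":
    print(metrics([[1.0, 3.0]], [[2.0, 2.0]]))
```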
To achieve robust outcomes, the above criteria were derived through 5-fold cross-validation. The dataset was split into five distinct subsets; in each iteration, four of them were used for training while the remaining subset served as the testing data. The average of the values obtained over the five iterations was subsequently regarded as the final accuracy of the model.
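The splitting and averaging procedure can be sketched as follows. The shuffling seed and the way samples are assigned to folds are assumptions, since the text does not specify them; `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns one metric value for the test indices.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def cross_validate(n_samples, fit_and_score, k=5):
    """Average the score returned by `fit_and_score` over the k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # With the 230 FEM samples, each fold holds 46 test samples.
    for train, test in k_fold_indices(230):
        print(len(train), len(test))
```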
Hyperparameter tuning
The hyperparameters play a crucial role in determining the accuracy, robustness, and overall capabilities of the ML model. Hence, to optimize the performance of the proposed model, extensive experiments were conducted to examine the impact of the hyperparameters on the model’s efficacy (see Tables S3‒S5). The mean MAE, MSE, and R^{2} values are presented in Tables 2 and 3: Table 2 reports their dependence on the number of hidden features in each layer and on the number of attention heads, while Table 3 reports their dependence on the training batch size. It should be noted that the MAE and MSE values were calculated on the normalized data.
Table 2 Influence of the hidden feature number and heads of attention mechanism on the model performance
Table 3 Influence of the training batch size on the model performance
As shown in Table 2, the R^{2} value exhibits a declining trend as the number of hidden features increases. At the same time, the MSE reaches its lowest value of 0.0060 when the model has 16 hidden features. This observation can be attributed to the constrained set of six input attributes and a sample size of 230 instances. When dealing with a smaller dataset, networks with fewer hidden features tend to perform better on the testing data. With this optimal hidden feature number, the focus then shifts to the number of attention heads. The findings from Table 2 reveal that the MSE value is minimized when employing an 8-head attention mechanism, indicating its optimal performance. Correspondingly, with the same configuration, a peak R^{2} value of 0.7852 is observed.
Table 3 illustrates that the MAE, MSE, and R^{2} values exhibit minimal fluctuations once the training batch size reaches or exceeds 16. Notably, these metrics are best when the batch size is set to 16. Specifically, the MSE value remains constant at 0.0060 for batch sizes of 16, 32, 64, and 128. Additionally, the R^{2} value shows little variance, with the largest value of 0.7883 occurring at a batch size of 16.
Model performance
A two-stage training method was utilized in this study, wherein the DF module was initially trained independently on the collected DFT data for predicting the Young’s modulus of a single CNT. Subsequently, the remaining parameters of the AGAT model were trained on the 230 numerically generated data points. Consequently, the predictive accuracies of the DF module and the overall AGAT were both evaluated. The DF module achieved an R^{2} value of 0.93, indicating its ability to accurately predict the Young’s modulus. The predictive accuracy of the overall AGAT model for the electrical resistance of the CNTs/PDMS composite under various strain levels (0%, 4%, 8%, 12%, 16%, and 20%) is presented in Figure 5B. The figure visually represents the model’s performance in predicting each sequential data point within the series R_{p} = [r_{p}^{1}, r_{p}^{2}, r_{p}^{3}, r_{p}^{4}, r_{p}^{5}, r_{p}^{6}]. Notably, the R^{2} values corresponding to these six outputs all surpass 0.79, implying an acceptable level of prediction accuracy. In addition, the R^{2} value fluctuates only slightly among the six sequential data points. This consistent behavior indicates that the AGAT model can reliably predict the electrical resistances under different strains.
Ablation study
To gain deeper insights into the architecture of the AGAT model, we conducted an ablation study encompassing two distinct experiments. The first experiment aimed to assess the effect of reducing the number of network layers. Table 4 presents a quantitative comparison across varying numbers of hidden layers (from the 2nd row to the 6th row) and reveals the significant impact of the network layers on the model performance. Notably, when the number of network layers is reduced, a marked decrease in performance is observed. This decrease is especially pronounced in the model comprising only a single layer, which is considered ineffective due to its negative R^{2} value. Conversely, as the number of network layers increases, a conspicuous increase in R^{2} values is observed, accompanied by simultaneous reductions in the MAE and MSE metrics. However, the improvement from four layers to five layers is only slight. The observations derived from this experiment suggest that the number of network layers is a critical factor contributing to the AGAT model’s performance. An insufficient number of layers yields unsatisfactory performance, while the integration of additional layers enhances the model’s predictive capabilities, although the improvement tapers off beyond a certain layer threshold.
Table 4 Ablation study on the hidden layer variations and the DF module in the AGAT model
To examine the significance and enhancements offered by the DF module designed to bridge microscale calculations to macroscale simulations, the second ablation experiment was conducted. This experiment involved the removal of the DF module from the model architecture. The detailed performance of the model without the DF module is displayed in Table 4 (the 1st row). Compared with the metrics in the 6th row, the exclusion of the DF module leads to increases in both the MAE and MSE values, from 0.0543 to 0.0584 and from 0.0060 to 0.0067, respectively. Simultaneously, the R^{2} value decreases from 0.7883 to 0.7548. These outcomes show that the AGAT multiscale approach outperforms the original GAT model, which is trained on the FEM data source alone. This highlights the added value of the DF module in enhancing the model’s predictive capabilities across multiple scales.
DISCUSSION
This study presents a novel approach, designated as the AGAT model, for predicting the electrical response of the CNTs/PDMS composite using a GAT-based ML model. For a specified mesostructure, the trained AGAT model can rapidly and directly predict the sensing behavior of the CNTs/PDMS composite without conducting physics-based multiscale simulations. The multiscale framework starts with DFT calculations and FEM simulations to generate the datasets. Based on the collected datasets, the AGAT model is trained with the mesostructural characteristics of the CNTs/PDMS composites as inputs, namely the CNT radius, the CNT length, the volume ratio, and the CNT quantity. The resulting electrical resistance sequence, recorded at strain levels of 0%, 4%, 8%, 12%, 16%, and 20%, constitutes the output of the model. To assess the model’s efficacy, three metrics are employed: the R^{2}, MAE, and MSE. Under 5-fold cross validation, the R^{2} values of the six output targets all exceed 0.79, indicating good agreement between the predicted values and the ground truth. Furthermore, the ablation study indicates that the optimal network architecture consists of five hidden layers, with the DF module playing a role in connecting the multiscale simulations.
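The evaluation protocol summarized above can be sketched as follows, assuming a generic k-fold split and a per-target R^{2} averaged over the folds. The nearest-neighbour predictor is only a stand-in for the trained AGAT model, the data are synthetic, and all function names are hypothetical.

```python
import random

# The six strain levels named in the text (0% to 20%).
STRAINS = [0.00, 0.04, 0.08, 0.12, 0.16, 0.20]


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and return k (train, validation) pairs."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for m, fold in enumerate(folds) if m != i for j in fold], folds[i])
        for i in range(k)
    ]


def r2(y_true, y_pred):
    """Coefficient of determination for one output target."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validate(X, Y, train_and_predict, k=5):
    """Return the R^2 of each output target, averaged over the k folds."""
    n_targets = len(Y[0])
    totals = [0.0] * n_targets
    for train, val in k_fold_indices(len(X), k):
        preds = train_and_predict(
            [X[i] for i in train], [Y[i] for i in train], [X[i] for i in val]
        )
        for t in range(n_targets):
            totals[t] += r2([Y[i][t] for i in val], [p[t] for p in preds])
    return [total / k for total in totals]


def nearest_neighbour(X_train, Y_train, X_val):
    """Stand-in predictor (NOT the AGAT model): copy the closest training sample."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [
        Y_train[min(range(len(X_train)), key=lambda i: dist(X_train[i], x))]
        for x in X_val
    ]


# Synthetic data: 4 normalized structure parameters -> 6 resistance values.
rng = random.Random(1)
X = [[rng.random() for _ in range(4)] for _ in range(50)]
Y = [[sum(x) * (1.0 + s) for s in STRAINS] for x in X]

scores = cross_validate(X, Y, nearest_neighbour, k=5)
print([round(s, 3) for s in scores])
```

Reporting one score per strain level, as done here, makes it visible whether the prediction quality degrades at particular strains rather than hiding it in a single aggregate value.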
CNTs/PDMS composites show great potential in electronic devices owing to the interlaced networking structure of the CNTs, which changes in response to external forces. The proposed AGAT model enables efficient exploration of the structure parameters that govern the sensing property. Although this study focuses on the electrical resistance-strain response of CNTs/PDMS composites, the satisfactory predictions made by AGAT also make it promising for other kinds of materials, especially nanocomposites. Nanocomposites consist of multiple phases, at least one of which has one, two, or three dimensions in the nanometer range. The reinforcing material usually forms a dispersed phase, while the matrix material forms a continuous phase. Hence, both the material properties of each phase and their structural assembly significantly influence the final property. In this case, multiscale simulations play a crucial role: microscale calculations determine the nanoscale material properties, and macroscale simulations optimize the material structure. The proposed ML-enabled multiscale strategy provides a platform for accelerating multiscale simulations from the microscale domain to the macroscale domain. In this framework, a well-designed embedded module acts as a bridge between the microscale and the larger scale, which facilitates training the ML model on hierarchical multiscale simulation data and thus endows it with the ability to predict material properties efficiently.
However, a challenge resides in the scale disparity that characterizes multiscale simulations. For example, in the investigation of CNTs/PDMS composites, the current literature on CNT property calculations using DFT reports nanotube lengths on the order of several tens of nanometers. In contrast, in the literature on finite element analyses at the meso- or macroscale, CNTs within CNTs/PDMS composites can extend to lengths of several thousand nanometers. To overcome this limitation, the present study conducted DFT simulations of CNTs with lengths surpassing 1000 nm. However, such extended CNT structures entail a considerable computational burden because of their sizable unit cells (over 10,000 atoms for a single CNT with a radius of 15 nm and a length of 1000 nm), making it extremely expensive to accumulate a sufficient number of data samples. Another challenge lies in the potential errors introduced by the simulation-based calculations. In this study, the ML model was established mainly using simulation data, which could lead to non-negligible mismatches with experimental measurements. Acquiring sufficient experimental data, however, is itself a major challenge that can hardly be met. Hence, efforts should be made to enhance the accuracy of the ML model by leveraging the large, comparatively “inexpensive” simulation datasets alongside the limited, expensive experimental datasets, using transfer learning, pretrained models, or other relevant techniques. The third challenge is the development of an optimal sampling algorithm, which demands an efficient strategy for selecting the most informative data samples. Collecting the training datasets requires combining simulations from the microscale to the macroscale, and because the hierarchical approach is often preferred over simultaneous execution, this incurs a substantial time cost. Consequently, strategies for data sampling need further exploration to determine the minimum number of multiscale simulations required without compromising the performance of the ML model.
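As one illustration of what such a sampling strategy might look like, the sketch below draws a Latin hypercube design over the four structure parameters, so that a small number of simulations still covers each parameter range evenly. Latin hypercube sampling is a generic space-filling technique chosen here for illustration; it is not the sampling method of this study, and the parameter bounds are placeholder assumptions.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points with exactly one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # Split [low, high) into n_samples equal strata and draw one value in each.
        points = [
            low + (high - low) * (i + rng.random()) / n_samples
            for i in range(n_samples)
        ]
        # Shuffle so that the strata are paired at random across dimensions.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Placeholder ranges (assumptions, not values from this study):
# CNT radius (nm), CNT length (nm), volume ratio, CNT quantity.
PARAM_BOUNDS = [(5.0, 15.0), (500.0, 1500.0), (0.01, 0.05), (50.0, 200.0)]

samples = latin_hypercube(8, PARAM_BOUNDS)
for radius, length, ratio, quantity in samples:
    # The CNT quantity is an integer in practice, so it is rounded for display.
    print(f"{radius:6.2f} {length:8.1f} {ratio:6.3f} {round(quantity):4d}")
```

A design of this kind fixes the number of expensive multiscale simulations in advance. Adaptive schemes such as active learning would instead choose each new sample based on the current model, at the cost of running the simulations sequentially.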
Funding
This work was supported by the National Key R&D Program of China (2022ZD0117501), and the National Natural Science Foundation of China (62201441).
Author contributions
L.Y. and X.W. designed the research and analyzed the data. X.W. supervised the project. C.Z. and J.C. conducted the FEM simulation. Z.S. and H.G. collected the FEM data. H.D. collected the microscale calculation data. M.Z. performed the hyperparameter tuning experiments. L.Y. and X.W. co-wrote the manuscript. All authors contributed to discussions.
Conflict of interest
The authors declare no conflict of interest.
Supplementary information
The supporting information is available online at https://doi.org/10.1360/nso/20230055. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
All Tables
Influence of the hidden feature number and heads of attention mechanism on the model performance
All Figures
Figure 1 The diagram of the ML-enabled multiscale approach for predicting the sensing behavior of CNTs/PDMS composites.
Figure 2 The flowchart of the RVE generation algorithm.
Figure 3 The FEM simulation of the CNTs/PDMS composite. (A) Tunnel effect; (B) boundary conditions; (C) distributions of electrical potential under different strains.
Figure 4 The architecture of AGAT. (A) The DF module constructed using ANN networks for predicting the Young’s modulus of an individual CNT; (B) integration of the value generated by the DF module with the original input; (C) configuration of the GAT with initial five layers of GAT convolutions, each featuring an 8-head attention mechanism, and a final layer with a single-head attention mechanism.
Figure 5 (A) The structure parameter vector X_{p} = [x_{p}^{1}, x_{p}^{2}, x_{p}^{3}, x_{p}^{4}] is fed to the AGAT model to predict the corresponding electrical resistance vector R_{p} = [r_{p}^{1}, r_{p}^{2}, r_{p}^{3}, r_{p}^{4}, r_{p}^{5}, r_{p}^{6}]; (B) the predictive performance of the AGAT model.