Transfer Learning in Computational Materials Engineering: Techniques and Case Studies

Daniel Fischer; Laura Meier; Thomas Braun; Stefan Koch

Abstract

Transfer learning has become a cornerstone of computational materials engineering, addressing the fundamental tension between the exponential growth of high-throughput simulation data and the persistent scarcity of high-fidelity experimental labels. By repurposing knowledge encoded in large-scale computational repositories—ranging from density-functional theory (DFT) databases to molecular dynamics trajectories—transfer learning enables accurate property prediction, inverse design, and autonomous discovery even in data-constrained regimes. This review synthesizes the field’s maturation from early domain-adaptation approaches in microstructure informatics to contemporary foundation-model strategies that span inorganic crystals, organic polymers, and hybrid interfaces. We trace the evolution of techniques including graph-neural-network (GNN) pre-training, multi-fidelity fusion, and structure-aware fine-tuning, while highlighting their deployment in closed-loop pipelines that couple simulation with robotic experimentation. Case studies drawn from battery electrolytes, high-entropy alloys, and 2D heterostructures illustrate how hierarchical transfer frameworks achieve chemical accuracy with orders-of-magnitude fewer labels than scratch-trained models. The synthesis reveals a unifying computational workflow: pre-train on universal descriptors, adapt via frozen or low-rank updates, and close the loop through uncertainty-guided active learning. This infrastructure-level perspective underscores transfer learning’s role in transforming materials engineering from a trial-and-error discipline into a predictive, self-optimizing ecosystem.

Introduction

The Materials Genome Initiative and its successors have generated petabyte-scale repositories of first-principles data, yet the translation of these resources into deployable technologies remains bottlenecked by the high cost and low throughput of experimental validation [1, 2]. In computational materials engineering, this asymmetry manifests most acutely in the small-data regime: a new alloy composition may have thousands of DFT-relaxed structures but only a handful of measured creep lifetimes; a polymer electrolyte dataset may contain millions of simulated ion diffusivities but fewer than fifty validated conductivity values at operating temperature. Traditional supervised learning collapses under such sparsity, producing models that overfit noise rather than generalize across chemical space [3, 4].

Transfer learning offers a principled escape from this impasse. By initializing model parameters with knowledge distilled from related but more abundant domains, practitioners can dramatically reduce the data volume required for target-task convergence. In materials contexts, the source domain is typically a large computational corpus (e.g., the Materials Project or Alexandria database), while the target domain is a sparse experimental or high-accuracy subset (e.g., CALPHAD-validated phase diagrams or in-operando battery cycling data). The transfer can occur at multiple levels: feature representations extracted from pre-trained GNNs [5, 6], low-rank adaptation of attention weights [7, 8], or even entire frozen encoders whose embeddings serve as universal descriptors [9, 10].

The intellectual lineage of transfer learning in materials traces to early microstructure reconstruction efforts that repurposed convolutional filters trained on natural images for phase-field simulations [11]. These approaches demonstrated that visual inductive biases—translation equivariance, hierarchical feature abstraction—translate surprisingly well to atomic-scale patterns. Subsequent work extended the paradigm to graph-structured data, where crystal graphs or molecular adjacency matrices replace pixel grids [5, 12]. The emergence of universal interatomic potentials further accelerated the field: models such as MACE and CHGNet, pre-trained on millions of DFT calculations across the periodic table, now serve as drop-in feature extractors for downstream tasks ranging from phonon spectra to defect migration barriers [6, 13].

Beyond accuracy gains, transfer learning reshapes the very architecture of materials discovery pipelines. In high-throughput virtual screening, a transferred model can triage billions of hypothetical compounds in hours rather than weeks, surfacing candidates whose subsequent DFT refinement is focused on the most promising 0.1 % of chemical space [14]. In inverse design, generative models conditioned on transferred representations can propose structures that simultaneously satisfy multiple conflicting objectives—high ionic conductivity and mechanical robustness, for instance—by borrowing latent manifolds learned from unrelated property spaces [15, 16]. Most profoundly, transfer learning underpins the transition from open-loop computation to closed-loop autonomy: pre-trained surrogates provide the rapid forward models required for Bayesian optimization loops, while their uncertainty estimates guide the selection of experiments that maximally reduce epistemic ignorance [17, 18].

This review adopts a systems-integration perspective. Rather than cataloging isolated benchmarks, we organize the literature around the computational workflows that transfer learning enables: (i) the construction of transferable representations, (ii) their adaptation to heterogeneous materials classes, and (iii) their embedding within autonomous discovery architectures. By synthesizing cross-study analyses—comparing GNN transfer versus kernel methods on the same datasets, or multi-fidelity versus single-fidelity strategies in alloy design—we reveal emergent principles that transcend individual publications. The result is a coherent narrative of how transfer learning is not merely an efficiency hack but the enabling infrastructure for a new generation of predictive, self-driving materials laboratories.

Landscape of Computational & Data-Driven Materials Engineering

Representation learning foundations for transfer

At the core of effective transfer lies the construction of material embeddings that are simultaneously expressive and transferable. Early descriptor-based approaches relied on hand-crafted features—coordination numbers, Voronoi polyhedra statistics, or SOAP kernels—whose transferability was constrained by domain-specific tuning and feature engineering assumptions [19]. While chemically interpretable, these descriptors encoded local environments in ways that limited extrapolation across compositional or structural regimes not represented in their calibration datasets. As materials discovery expanded toward high-dimensional chemical spaces, the limitations of fixed descriptor vocabularies became increasingly evident.

The transition toward learned representations, particularly those derived from graph neural networks (GNNs), marked a qualitative shift in representational philosophy. By propagating messages along interatomic edges, GNNs encode both localized bonding chemistry and extended periodic interactions in a unified data-driven framework [5, 20]. This relational encoding enables the simultaneous capture of coordination environments, bond polarity distributions, and lattice connectivity patterns within learned embedding spaces. Unlike handcrafted descriptors, these embeddings evolve during training, internalizing structural regularities directly from data.
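
The message-passing idea described above can be illustrated with a minimal sketch. It assumes a toy undirected graph, hand-set feature vectors, and a fixed blending weight (`self_weight`, a hypothetical parameter); production GNNs replace the plain average with learned, nonlinear update functions and usually include edge features such as interatomic distances.

```python
from typing import Dict, List, Tuple


def message_passing_step(
    features: Dict[int, List[float]],
    edges: List[Tuple[int, int]],
    self_weight: float = 0.5,
) -> Dict[int, List[float]]:
    """Run one round of mean-aggregation message passing on an undirected graph."""
    neighbors: Dict[int, List[int]] = {node: [] for node in features}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = {}
    for node, h in features.items():
        nbrs = neighbors[node]
        if nbrs:
            # Average the neighbor features component by component.
            message = [sum(features[n][k] for n in nbrs) / len(nbrs) for k in range(len(h))]
        else:
            message = [0.0] * len(h)
        # Blend the atom's own state with the aggregated message.
        updated[node] = [self_weight * hk + (1.0 - self_weight) * mk for hk, mk in zip(h, message)]
    return updated


# Toy three-atom chain 0-1-2 with one-hot "species" features (illustrative only).
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
edges = [(0, 1), (1, 2)]
updated = message_passing_step(features, edges)
```

After a single step the central atom already carries information about both neighbors; stacking further steps widens the receptive field, which is how such networks reach beyond the first coordination shell.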

Pre-training strategies have further expanded this representational capacity. When GNNs are trained on large density functional theory corpora using self-supervised objectives—such as masked atom prediction, structural completion, or total energy regression—the resulting node and graph embeddings encode transferable chemical priors spanning wide compositional domains [6]. These embeddings capture recurring bonding motifs, symmetry regularities, and thermodynamic trends that generalize across unseen materials systems, thereby forming the epistemic substrate upon which downstream transfer tasks operate.
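
As a rough illustration of the masked-atom objective mentioned above, the sketch below corrupts a list of species labels and records the hidden labels as prediction targets. The mask token, the masking fraction, and the toy species list are assumptions made for illustration, not settings taken from any cited study.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token


def mask_atoms(species, mask_fraction=0.25, seed=0):
    """Hide a random subset of species labels and return (corrupted list, targets)."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_fraction * len(species)))
    masked_idx = set(rng.sample(range(len(species)), n_mask))
    corrupted = [MASK if i in masked_idx else s for i, s in enumerate(species)]
    # The model would be trained to recover these labels from the surrounding graph.
    targets = {i: species[i] for i in masked_idx}
    return corrupted, targets


# Toy list of site species (illustrative, not a real structure).
species = ["Li", "Fe", "P", "O", "O", "O", "O", "Li"]
corrupted, targets = mask_atoms(species)
```

Because the targets come from the structure itself, no property labels are needed, which is what makes this kind of objective usable on very large unlabeled corpora.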

Hierarchical transfer frameworks such as AtomSets operationalize this universality through architectural modularization. Multi-scale descriptors extracted from frozen GNN layers are routed into lightweight perceptrons tailored to small target datasets, effectively decoupling representation learning from property-specific fitting [9]. This separation allows models trained on datasets exceeding 100 000 structures to inform predictive systems built from only hundreds of labeled samples. The approach preserves inductive biases embedded within large-scale pre-training while minimizing overfitting risks in data-sparse regimes.
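
The decoupling of a frozen encoder from a lightweight property head can be sketched as follows. This is a generic illustration, not the AtomSets implementation: the encoder weights `ENCODER_W`, the synthetic targets, and the plain gradient-descent head are all assumptions chosen to keep the example self-contained.

```python
import random

# Hypothetical "pre-trained" encoder weights. They are frozen: nothing below updates them.
ENCODER_W = ((0.8, -0.3, 0.5), (0.1, 0.9, -0.4))


def frozen_encoder(x):
    """Map a 3-component input to a 2-dimensional embedding with fixed weights and a ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in ENCODER_W]


def mse(w, b, embeddings, targets):
    """Mean squared error of the linear head on the given embeddings."""
    errors = [sum(wk * zk for wk, zk in zip(w, z)) + b - y for z, y in zip(embeddings, targets)]
    return sum(e * e for e in errors) / len(errors)


def fit_head(embeddings, targets, lr=0.05, epochs=500):
    """Fit a linear head y = w.z + b by full-batch gradient descent on the MSE."""
    dim, n = len(embeddings[0]), len(targets)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for z, y in zip(embeddings, targets):
            err = sum(wk * zk for wk, zk in zip(w, z)) + b - y
            for k in range(dim):
                grad_w[k] += 2.0 * err * z[k] / n
            grad_b += 2.0 * err / n
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Small synthetic target dataset (8 samples) standing in for sparse labels.
rng = random.Random(0)
inputs = [[rng.random() for _ in range(3)] for _ in range(8)]
embeddings = [frozen_encoder(x) for x in inputs]
targets = [2.0 * z[0] - 1.0 * z[1] + 0.5 for z in embeddings]

initial_loss = mse([0.0, 0.0], 0.0, embeddings, targets)
w, b = fit_head(embeddings, targets)
final_loss = mse(w, b, embeddings, targets)
```

Only the three head parameters are fitted to the eight target samples; the encoder is evaluated once per sample and never modified, which is the property that limits overfitting in the small-data regime.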

Complementary strategies employ contrastive learning to align computational and experimental embeddings within shared latent spaces. By enforcing representational proximity between simulated structures and experimentally characterized counterparts, these approaches enable zero-shot transfer across characterization modalities, bridging the simulation–experiment divide that historically constrained materials prediction workflows [21, 22]. A comparative synthesis of dominant transfer learning strategies and their discovery roles is summarized in Table 1.

Table 1. Comparative Transfer Learning Strategies in Computational Materials Engineering

Transfer Strategy	Source Domain Representation	Adaptation Mechanism	Target Applications	Discovery Advantage
GNN Pre-Training	Large DFT crystal graphs	Fine-tuning / frozen encoders	Property prediction, defect energetics	Captures universal chemical motifs
Multi-Fidelity Transfer	GGA + hybrid DFT datasets	Residual correction / feature fusion	Alloy stability, reaction energetics	Reduces high-cost computation burden
Contrastive Cross-Modal Transfer	Simulation + experimental embeddings	Latent space alignment	Spectroscopy prediction, structure validation	Bridges simulation–experiment divide
Adapter-Based Domain Transfer	Molecular ↔ inorganic graphs	Parameter-efficient adapters	Polymer design, hybrid materials	Enables cross-class knowledge reuse
Multimodal Transfer Learning	Graphs + spectra + diffraction	Cross-modality embedding loss	Autonomous labs, real-time inference	Integrates heterogeneous evidence streams
Foundation Interatomic Potentials	Periodic table-scale training	Downstream surrogate modeling	Phonons, diffusion, phase transitions	Universal transferable force fields
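
The contrastive cross-modal alignment listed in Table 1 is commonly implemented with an InfoNCE-style loss. The sketch below is one such formulation, assuming cosine similarity, a temperature of 0.1, and hand-written two-dimensional embeddings; the cited studies may use different similarity measures or loss variants.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(sim_embs, exp_embs, temperature=0.1):
    """Average InfoNCE loss; pair i of sim_embs is the positive for pair i of exp_embs."""
    n = len(sim_embs)
    total = 0.0
    for i in range(n):
        logits = [cosine(sim_embs[i], exp_embs[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Illustrative embeddings: simulated structures and their experimental counterparts.
simulated = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
experimental = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.1]]
shuffled = [experimental[1], experimental[2], experimental[0]]

aligned_loss = info_nce(simulated, experimental)
misaligned_loss = info_nce(simulated, shuffled)
```

The loss is small when each simulated embedding sits closest to its own experimental counterpart and grows when the pairing is scrambled, which is the signal that pulls the two modalities into a shared latent space during training.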

Graph neural networks as transfer engines

GNN architectures have consequently become the de facto backbone of transfer learning in materials informatics. Their relational inductive biases allow them to encode physically meaningful dependencies that remain stable across property domains. The crystal graph convolutional neural network (CGCNN) family exemplifies this adaptability: models pre-trained on formation energies have been fine-tuned to predict a wide spectrum of properties, including band gaps, elastic constants, and thermodynamic stability, using comparatively limited additional supervision [7, 8, 12].

Structure-aware variants that explicitly encode symmetry breaking further enhance transferability. Through equivariant message passing operations and directional bond encodings, these models capture anisotropic interactions and defect-induced distortions that conventional invariant networks struggle to represent [6-8]. This structural sensitivity enables transfer across low-symmetry systems such as grain boundaries, dislocation networks, and interfacial heterostructures.

Benchmark studies reinforce the importance of compositional breadth in source domains. Models pre-trained on chemically diverse datasets spanning the periodic table routinely outperform domain-specific architectures even when applied to narrowly scoped target tasks [20]. Such findings suggest that exposure to broad chemical variability cultivates more generalizable embeddings that encode transferable heuristics about bonding, coordination, and stability landscapes.

Multi-task learning architectures amplify these gains by jointly optimizing multiple correlated properties within a single backbone network. By learning formation energies, electronic structures, and thermodynamic descriptors simultaneously, these systems embed latent correlations between properties into shared feature spaces, improving downstream transfer relative to isolated single-task models [21, 23].
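
One common way to write the joint objective described above is as a weighted sum of per-property losses over a shared backbone. The notation is an assumption introduced here for illustration: $f_\theta$ denotes the shared backbone, $g_{\phi_t}$ the head for property $t$, $\ell_t$ its loss function, $N_t$ the number of labels available for that property, and $\lambda_t$ a task weight.

```latex
\mathcal{L}_{\mathrm{MT}}\bigl(\theta, \{\phi_t\}_{t=1}^{T}\bigr)
  = \sum_{t=1}^{T} \lambda_t \, \frac{1}{N_t} \sum_{i=1}^{N_t}
    \ell_t\!\bigl( g_{\phi_t}\bigl(f_\theta(x_i)\bigr),\; y_i^{(t)} \bigr)
```

Because every term shares $\theta$, gradients from data-rich properties shape the representation that data-poor properties rely on; the weights $\lambda_t$ control how strongly each task influences the backbone.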

In multi-fidelity settings, transfer learning exploits hierarchical data quality structures. Low-cost generalized-gradient-approximation calculations provide abundant source data, while hybrid-functional simulations or experimental measurements form sparse target datasets. Transfer is achieved through feature fusion strategies or residual correction models trained on fidelity gaps, enabling predictive calibration without requiring uniformly high-cost computations [21, 22].
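
The residual-correction route can be sketched in a few lines. The example assumes a one-dimensional input, a synthetic low-fidelity function, and a linear fidelity gap; all numbers are invented for illustration, and real applications would learn the gap over a learned representation instead of a scalar.

```python
def low_fidelity(x):
    """Stand-in for a cheap, abundant calculation (for example, a GGA-level estimate)."""
    return 1.0 * x


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Sparse synthetic high-fidelity labels, generated as y = 1.2 x + 0.3.
xs_high = [0.0, 1.0, 2.0, 3.0]
ys_high = [0.3, 1.5, 2.7, 3.9]

# Learn only the gap between the two fidelities.
residuals = [y - low_fidelity(x) for x, y in zip(xs_high, ys_high)]
slope, intercept = fit_line(xs_high, residuals)


def corrected(x):
    """Low-fidelity prediction plus the learned residual correction."""
    return low_fidelity(x) + slope * x + intercept


prediction = corrected(5.0)
```

The residual is typically smoother and smaller in magnitude than the property itself, so it can often be learned from far fewer high-fidelity points than a direct model would need.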

Cross-domain and multimodal transfer

The most ambitious transfer paradigms extend beyond property prediction into cross-class knowledge migration. Knowledge-reuse strategies distill chemical heuristics from molecular datasets into inorganic crystal prediction frameworks, or vice versa, using adapter modules that preserve source model parameters while learning domain-specific transformations [16]. This modular transfer preserves learned chemical priors while accommodating structural divergence between materials families.
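
A minimal sketch of such an adapter is given below, assuming the widely used residual bottleneck form (down-projection, nonlinearity, up-projection, skip connection). The weights in `FROZEN_W`, the class name, and the layer sizes are hypothetical and chosen only to keep the example small.

```python
# Hypothetical frozen source-model layer (4 x 4); it is never modified below.
FROZEN_W = (
    (0.5, -0.2, 0.1, 0.0),
    (0.3, 0.8, -0.5, 0.2),
    (-0.4, 0.1, 0.9, -0.1),
    (0.2, -0.3, 0.4, 0.7),
)


def matvec(matrix, vector):
    """Multiply a matrix (sequence of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def frozen_layer(x):
    return matvec(FROZEN_W, x)


class BottleneckAdapter:
    """Residual adapter: h + W_up * relu(W_down * h), with W_up initialised to zero."""

    def __init__(self, dim, bottleneck):
        # Small fixed values for reproducibility; in practice these would be random.
        self.w_down = [[0.1 * ((i + j) % 3 - 1) for j in range(dim)] for i in range(bottleneck)]
        self.w_up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, h):
        hidden = [max(0.0, v) for v in matvec(self.w_down, h)]
        delta = matvec(self.w_up, hidden)
        return [hi + di for hi, di in zip(h, delta)]

    def num_parameters(self):
        return sum(len(r) for r in self.w_down) + sum(len(r) for r in self.w_up)


x = [1.0, 2.0, -1.0, 0.5]
adapter = BottleneckAdapter(dim=4, bottleneck=1)
base_output = frozen_layer(x)
adapted_output = adapter(base_output)
```

With the up-projection initialised to zero, the adapted model starts out identical to the source model, and only the adapter's 8 parameters (against 16 in the frozen layer) would be updated on the target domain.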

In polymer informatics, representations learned from small-molecule conformers transfer to extended chain architectures when augmented with graph coarsening layers that encode repeat-unit symmetries [15]. Such approaches bridge length-scale discontinuities that historically separated molecular and macromolecular modeling regimes.

For two-dimensional materials and heterostructures, transfer occurs across stacking registries, twist angles, and interlayer coupling configurations. Modular encodings of van der Waals interactions and lattice misalignment physics enable predictive generalization across configurationally diverse layered systems [24].

Multimodal integration represents the frontier of transfer learning. By combining atomic structure graphs with spectroscopic fingerprints, diffraction signatures, and natural-language synthesis descriptions, cross-modality embedding objectives align disparate data streams within unified representation spaces. This alignment allows a single transferred model to ingest both DFT-relaxed geometries and experimental X-ray absorption spectra for property inference [25]. Such multimodal transfer architectures are particularly consequential in autonomous laboratory environments, where real-time sensor data must be interpreted alongside simulation priors.

Case studies in property prediction and design

Concrete demonstrations underscore the operational impact of transfer frameworks. In single-crystal superalloy design, a GNN pre-trained on binary and ternary phase diagrams was fine-tuned using sparse creep-rupture datasets. The transferred model identified a novel composition exceeding prior temperature capability benchmarks by 40 °C, with experimental validation achieved within weeks rather than traditional multi-year development cycles [24].

In solid-state electrolyte discovery, transfer from oxide conductivity datasets to sulfide chemistries using frozen graph encoders reduced required training data from thousands to only dozens of measurements. This efficiency enabled high-throughput screening of 10⁵ candidate compositions, dramatically accelerating ionic conductor exploration [14].

Microstructure informatics provides another compelling illustration. Transfer learning frameworks reconstructed three-dimensional grain boundary networks from two-dimensional electron backscatter diffraction slices with predictive accuracy rivaling full serial sectioning—an achievement unattainable through scratch-trained models given the prohibitive data acquisition burden [11].

Across these case studies, a consistent structural logic emerges. Source domains contribute inductive physical priors—symmetry constraints, bonding rules, thermodynamic principles—while target domains provide high-fidelity calibration signals. The fusion of these knowledge regimes enables predictive systems capable of extrapolative reasoning across expansive materials design spaces.

Autonomous & closed-loop discovery systems

The integration of transfer learning with autonomous experimentation has transformed materials discovery from a sequential, human-paced process into a parallel, self-optimizing ecosystem. At the heart of these systems lies a Bayesian active-learning loop in which a transferred surrogate model proposes the next experiment, an automated platform executes it, and the outcome refines the model for the subsequent iteration [17, 18]. An integrated schematic of this transfer-learning-enabled closed-loop discovery workflow is shown in Figure 1.

Figure 1. Transfer-learning-enabled closed-loop discovery architecture for computational materials engineering. A foundation graph neural network pre-trained on large density-functional-theory repositories is adapted to sparse experimental domains through fine-tuning or frozen-encoder transfer. The resulting surrogate model, coupled with uncertainty quantification, drives acquisition functions that balance exploration and exploitation. Selected candidates are evaluated through high-throughput simulation or robotic experimentation, and multimodal feedback—structural, spectroscopic, and electrochemical—updates the model iteratively, forming a self-optimizing materials discovery loop.

Early demonstrations of this paradigm, such as the CAMEO platform, employed Gaussian processes for surrogate modeling and achieved 10–100× acceleration in the discovery of novel inorganic compounds [17]. The incorporation of transferred GNNs has since extended the approach to far more complex design spaces. In the A-Lab autonomous laboratory, a pre-trained universal potential provided instantaneous energy landscapes for candidate synthesis routes, while an active-learning controller selected precursors and annealing conditions that maximized phase purity [18]. Transfer learning was critical: the model, initialized on millions of computed formation energies, required only a few dozen in-situ diffraction measurements to achieve >70 % success in synthesizing previously unreported phases.

More recent architectures network multiple autonomous platforms through shared transferred representations. A system exploring polymer electrolytes can borrow knowledge from a parallel campaign on battery cathodes by aligning their graph embeddings in a common latent space; the resulting knowledge transfer accelerates both campaigns beyond what either could achieve in isolation [19]. Uncertainty-aware transfer further refines the loop: when epistemic uncertainty in a transferred region exceeds a threshold, the system preferentially queries high-fidelity oracles (e.g., synchrotron X-ray or neutron scattering) rather than defaulting to cheap simulations.
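
The threshold rule for routing queries can be sketched as follows, assuming that epistemic uncertainty is approximated by the spread of an ensemble of surrogate predictions. The function name, the oracle labels, and the threshold value are illustrative assumptions.

```python
import statistics


def choose_oracle(ensemble_predictions, threshold):
    """Route a candidate to a costly oracle when the ensemble disagrees strongly."""
    spread = statistics.pstdev(ensemble_predictions)
    oracle = "high_fidelity" if spread > threshold else "cheap_simulation"
    return oracle, spread


# Ensemble members agree: a cheap simulation is considered sufficient.
confident_choice, confident_spread = choose_oracle([1.00, 1.01, 0.99], threshold=0.1)

# Ensemble members disagree: escalate to the high-fidelity measurement.
uncertain_choice, uncertain_spread = choose_oracle([0.2, 1.0, 1.8], threshold=0.1)
```

In practice the threshold would be tuned against the relative cost of the two oracles, so that expensive measurements are spent only where the transferred model is extrapolating.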

The computational workflow formalizes as a recursive update:

Discovery loop = Transfer(pre-trained model, target data) → Surrogate(property, uncertainty) → Acquisition(exploration + exploitation) → Experiment/Simulation → Feedback(update model or dataset) → repeat.
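
The loop above can be written out as a compact sketch. Everything in it is a stand-in: the experiment is a synthetic function, the transferred model is a deliberately biased prior, the surrogate corrects that prior with the residual of the nearest labeled point, and distance to that point serves as a crude uncertainty proxy in an upper-confidence-bound acquisition. A real system would use a trained surrogate with calibrated uncertainties.

```python
def experiment(x):
    """Synthetic stand-in for a measurement; its optimum is at x = 0.7."""
    return -(x - 0.7) ** 2


def pretrained_prior(x):
    """Hypothetical transferred model: biased, with its optimum at x = 0.6."""
    return -(x - 0.6) ** 2


def surrogate(x, labeled):
    """Prior corrected by the nearest labeled residual; distance as uncertainty proxy."""
    nearest_x, nearest_y = min(labeled, key=lambda p: abs(p[0] - x))
    mean = pretrained_prior(x) + (nearest_y - pretrained_prior(nearest_x))
    uncertainty = abs(nearest_x - x)
    return mean, uncertainty


def discovery_loop(candidates, labeled, iterations, kappa=0.5):
    """Repeat: score the pool, run the best candidate, feed the result back."""
    labeled = list(labeled)
    pool = [x for x in candidates if all(x != lx for lx, _ in labeled)]
    for _ in range(iterations):
        def ucb(x):
            mean, uncertainty = surrogate(x, labeled)
            return mean + kappa * uncertainty  # exploitation + exploration

        chosen = max(pool, key=ucb)
        labeled.append((chosen, experiment(chosen)))  # feedback step
        pool.remove(chosen)
    return labeled


candidates = [i / 20 for i in range(21)]
initial = [(0.0, experiment(0.0))]
history = discovery_loop(candidates, initial, iterations=6)
best_x, best_y = max(history, key=lambda p: p[1])
```

The parameter `kappa` sets the balance between exploiting the surrogate's current optimum and exploring regions far from any labeled point; each appended measurement changes the surrogate seen by the next iteration, which is what closes the loop.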

This cycle, when powered by efficient transfer mechanisms, compresses what once required years of sequential experimentation into days of parallel operation. The systems perspective reveals that transfer learning is not an add-on but the connective tissue that makes closed-loop autonomy scalable: it supplies the rapid, generalizable surrogates without which active learning would remain computationally prohibitive, and it provides the domain-bridging representations that allow knowledge to flow between disparate experimental modalities and material families.

Results and Discussion

The integration of knowledge graphs (KGs) into computational and data-driven materials engineering signals a structural reconfiguration of how knowledge is generated, organized, and operationalized across discovery ecosystems. Rather than functioning as passive repositories, KGs increasingly operate as epistemic infrastructures—dynamic systems that structure, contextualize, and activate materials knowledge across scales of reasoning, experimentation, and design. Synthesizing the reviewed literature, several interwoven themes emerge that collectively position KGs as catalytic enablers of next-generation materials innovation.

Interconnections between data structuring and reasoning

A foundational contribution of KGs lies in their capacity to fuse data structuring with machine reasoning. Automated graph construction from scientific text corpora, coupled with ontology alignment and terminology harmonization, establishes the semantic scaffolding required for downstream inference [9, 16, 24, 26]. These structuring processes transform fragmented datasets into relationally coherent knowledge spaces, where entities—materials, properties, synthesis pathways—are embedded within interpretable semantic networks.

Within such environments, reasoning mechanisms can operate with enhanced contextual awareness. Reinforcement learning frameworks leverage KG-encoded relationships to guide alloy optimization strategies, enabling agents to explore compositional spaces informed by prior metallurgical knowledge [13]. Similarly, knowledge-guided pre-training approaches integrate graph-structured priors into molecular representation learning, improving transferability and predictive fidelity across chemical domains [27].

Semantic integration further amplifies these reasoning capabilities. By unifying heterogeneous datasets—ranging from biomolecular systems to crystalline lattices—KGs support cross-domain inference, allowing insights from one materials class to inform another [6, 21, 22, 28]. This integrative capacity reframes data structuring as an active enabler of discovery rather than a preparatory step.

In materials informatics workflows, these interconnections manifest through graph neural networks (GNNs) and contrastive learning systems that embed relational dependencies directly into representation spaces [23, 29]. Functional prompting strategies, enriched by KG context, allow models to reason over mechanistic pathways rather than isolated descriptors. Consequently, long-standing limitations such as sparse data regimes are mitigated, enabling robust property prediction and inverse design across underexplored materials domains [15, 17, 27].

Critically, the structuring–reasoning relationship is bidirectional. Enhanced representations expose ontological gaps, prompting refinement of KG schemas. This recursive feedback loop drives co-evolution between knowledge architectures and learning models, fostering iterative maturation of the broader materials AI ecosystem.

Integration in autonomous systems and closed-loop workflows

Autonomous laboratories represent one of the most tangible realizations of KG-enabled intelligence. In these environments, KGs function as orchestration layers that coordinate experimental planning, execution, and interpretation within closed-loop discovery cycles [2, 3, 5, 12, 15].

Dynamic graph infrastructures capture provenance, track experimental states, and encode interdependencies between instruments, materials, and outcomes. Distributed KG architectures enable self-driving laboratories to operate collaboratively, exchanging knowledge in real time while maintaining traceability [3, 5, 12].

Active learning pipelines further demonstrate this integration. By embedding uncertainty metrics within KG nodes and edges, systems can prioritize high-information experiments, optimizing resource allocation and accelerating convergence toward target properties [17, 19]. Here, KGs act as decision substrates—linking probabilistic inference with experimental steering.
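
As a rough sketch of uncertainty stored on graph edges, the example below attaches an uncertainty value to each relation and ranks candidate experiments by it. The entity names, relation labels, and values are invented for illustration, and a plain list of dictionaries stands in for a real graph store.

```python
# Each edge is a (subject, relation, object) triple annotated with model uncertainty.
knowledge_graph = [
    {"subject": "alloy_A", "relation": "has_creep_life", "object": "prediction_1", "uncertainty": 0.42},
    {"subject": "alloy_B", "relation": "has_creep_life", "object": "prediction_2", "uncertainty": 0.07},
    {"subject": "electrolyte_C", "relation": "has_conductivity", "object": "prediction_3", "uncertainty": 0.65},
    {"subject": "electrolyte_D", "relation": "has_conductivity", "object": "prediction_4", "uncertainty": 0.18},
]


def prioritize_experiments(edges, budget):
    """Return the (subject, relation) pairs whose predictions are least certain."""
    ranked = sorted(edges, key=lambda e: e["uncertainty"], reverse=True)
    return [(e["subject"], e["relation"]) for e in ranked[:budget]]


next_experiments = prioritize_experiments(knowledge_graph, budget=2)
```

Ranking by raw uncertainty is the simplest possible policy; acquisition functions that also weigh the predicted property value could be substituted without changing the graph structure.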

The fusion of simulation and experimentation is also mediated through KG frameworks. Multimodal infrastructures integrate computational predictions, laboratory measurements, and literature knowledge into unified graphs, enabling comparative reasoning across evidence modalities [18, 29, 30]. Event-sourced architectures extend this paradigm by chronologically encoding discovery trajectories, allowing retrospective analysis and adaptive workflow redesign.
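
The event-sourced idea can be illustrated with an append-only log that is replayed to reconstruct the state of a campaign at any earlier step. The event fields and sample names below are assumptions made for the sketch, not a schema drawn from the cited systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    step: int      # position in the discovery trajectory
    sample: str    # hypothetical sample identifier
    kind: str      # "predicted" or "measured"
    value: float


def replay(events, up_to_step):
    """Rebuild the per-sample state by applying events in chronological order."""
    state = {}
    for event in sorted(events, key=lambda e: e.step):
        if event.step > up_to_step:
            break
        state.setdefault(event.sample, {})[event.kind] = event.value
    return state


log = [
    Event(step=1, sample="S1", kind="predicted", value=0.80),
    Event(step=2, sample="S1", kind="measured", value=0.65),
    Event(step=3, sample="S2", kind="predicted", value=0.90),
]

state_after_step_1 = replay(log, up_to_step=1)
state_after_step_3 = replay(log, up_to_step=3)
```

Because events are never overwritten, the state at any earlier point can be recovered exactly, which is what supports the retrospective analysis described above.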

Application-specific KGs—such as chemical reaction networks and natural products graphs—illustrate how relational modeling expands design spaces while preserving mechanistic coherence [11, 20]. These systems enable generative exploration without severing links to empirical feasibility.

Collectively, the literature positions KGs as operational conductors of autonomous discovery ecosystems, harmonizing robotics, AI, and high-throughput experimentation to compress innovation cycles and reduce reliance on manual oversight [2, 15, 19].

Cross-disciplinary insights and ecosystem building

The evolution of KG applications in materials engineering is deeply informed by precedents in biomedicine, pharmacology, and systems biology. Precision medicine graphs, for example, demonstrate how patient-specific data integration can guide personalized therapeutic strategies—an approach conceptually transferable to tailored materials design under context-specific performance constraints [7, 8, 10, 22, 25, 28].

Biological interaction graphs offer further analogical value. Models of cell signaling, protein interaction, and RNA communication provide structural templates for representing microstructural interfaces, defect interactions, and interphase phenomena in materials systems [14, 28]. These cross-domain translations enrich modeling vocabularies and expand representational imagination within materials informatics.

Open-source KG ecosystems also play a pivotal role in community formation. Collaborative graph platforms facilitate shared ontology development, standardized data pipelines, and interoperable reasoning tools, reducing fragmentation across institutional and disciplinary boundaries [10, 25].

Hackathon-driven explorations integrating large language models (LLMs) with KGs highlight emerging hybrid paradigms, where natural language extraction feeds structured graph reasoning pipelines [4, 19]. Such integrations enable rapid knowledge ingestion from literature while preserving relational rigor.

Community assessments consistently emphasize the need for unified data governance frameworks. KGs provide a structural solution, enabling federated knowledge integration while maintaining domain specificity [1]. This ecosystem perspective reframes KGs not merely as tools but as infrastructural backbones supporting interdisciplinary convergence across materials science, chemistry, biology, and computational engineering.

Implications for materials discovery paradigms

The cumulative impact of these developments is a paradigmatic shift in how materials discovery is conceptualized and operationalized. Traditional empiricism—anchored in sequential experimentation—is increasingly complemented, and in some contexts supplanted, by predictive and autonomous discovery frameworks.

KGs enable systems-level integration across modeling, experimentation, and design, dissolving silos that historically constrained innovation. Their relational architectures support inverse design strategies, where target functionalities guide backward reasoning through synthesis pathways and compositional spaces [6, 11, 15, 19].

Uncertainty-aware learning, embedded within graph structures, enhances epistemic transparency, allowing researchers to interrogate confidence landscapes alongside predictions. Distributed automation frameworks extend these capabilities globally, democratizing access to advanced discovery infrastructures.

From a sustainability perspective, KG-enabled discovery can accelerate identification of low-carbon materials, circular manufacturing pathways, and resource-efficient chemistries—aligning materials innovation with planetary imperatives.

Conclusion

Knowledge graphs are rapidly transitioning from auxiliary data tools to central intelligence architectures within computational materials engineering. Their ability to structure heterogeneous knowledge, enable context-aware reasoning, and orchestrate autonomous workflows positions them as foundational infrastructures for next-generation discovery ecosystems.

This review synthesizes cross-domain advances to demonstrate that the transformative power of KGs lies not in isolated functionalities but in their integrative capacity. By linking representation learning, experimental automation, semantic harmonization, and interdisciplinary knowledge exchange, KGs establish a unifying substrate through which materials innovation can scale in complexity and impact.

Looking forward, several trajectories will define the maturation of KG-enabled materials science:

Scalable ontology governance to ensure semantic interoperability across global databases.
Hybrid neuro-symbolic architectures combining deep learning with graph reasoning.
Human-AI collaborative interfaces that preserve interpretability in autonomous labs.
Sustainability-aligned knowledge modeling embedding environmental metrics into discovery graphs.

Ultimately, KGs herald a transition toward cognitively augmented materials engineering—where discovery systems not only compute but contextualize, reason, and adapt. By embedding intelligence within the very fabric of data infrastructures, knowledge graphs redefine the epistemic and operational horizons of materials science, enabling a future in which discovery is faster, more interpretable, and systemically integrated across the scientific enterprise.

Acknowledgements

None

Conflict of interest

None

Financial support

None

Ethics statement

None

References

Chen C, Ong SP. A universal graph deep learning interatomic potential for the periodic table. Nat Comput Sci. 2022;2(11):718-28.

Wang AYT, Murdock RJ, Kauwe SK, Oliynyk AO, Gurlo A, Brgoch J, et al. Machine learning for materials scientists: An introductory guide toward best practices. Chem Mater. 2020;32(12):4954-65.

Sendek AD, Cubuk ED, Antoniuk ER, Cheon G, Cui Y, Reed EJ, et al. Machine learning-assisted discovery of solid Li-ion conducting materials. Chem Mater. 2019;31(2):342-52.
https://doi.org/10.1021/acs.chemmater.8b03272

Xie T, Grossman JC. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys Rev Lett. 2018;120(14):145301.
https://doi.org/10.1103/PhysRevLett.120.145301

Ward L, Dunn A, Faghaninia A, Zimmermann NER, Bajaj S, Wang Q, et al. Matminer: An open source toolkit for materials data mining. Comput Mater Sci. 2018;152:60-9.
https://doi.org/10.1016/j.commatsci.2018.05.018

Deringer VL, Caro MA, Csányi G. Machine learning interatomic potentials as emerging tools for materials science. Adv Mater. 2019;31(46):1902765.
https://doi.org/10.1002/adma.201902765

Xie T, Fu X, Ganea OE, Barzilay R, Jaakkola TS. Crystal diffusion variational autoencoder for periodic material generation. arXiv preprint arXiv:2110.06197. 2021.

Chen C, Ye W, Zuo Y, Zheng C, Ong SP. Graph networks as a universal machine learning framework for molecules and crystals. Chem Mater. 2019;31(9):3564-72.

Li X, Zhang Y, Zhao H, Burkhart C, Brinson LC, Chen W. A transfer learning approach for microstructure reconstruction and structure-property predictions. Sci Rep. 2018;8(1):13461.

Yang F, Zhao W, Ru Y, Lin S, Huang J, Du B, et al. Transfer learning enables the rapid design of single crystal superalloys with superior creep resistances at ultrahigh temperature. npj Comput Mater. 2024;10(1):149.
https://doi.org/10.1038/s41524-024-01349-9

Butler KT, Davies DW, Cartwright H, Isayev O, Walsh A. Machine learning for molecular and materials science. Nature. 2018;559(7715):547-55.

Lee J, Asahi R. Transfer learning for materials informatics using crystal graph convolutional neural network. Comput Mater Sci. 2021;190:110314.
https://doi.org/10.1016/j.commatsci.2021.110314

Schmidt J, Marques MRG, Botti S, Marques MAL. Recent advances and applications of machine learning in solid-state materials science. npj Comput Mater. 2019;5(1):83.

Bets KV, O'Driscoll PC, Yakobson BI. Physics-inspired transfer learning for ML-prediction of CNT band gaps from limited data. npj Comput Mater. 2024;10(1):66.
https://doi.org/10.1038/s41524-024-01247-0

Chen C, Ong SP. AtomSets as a hierarchical transfer learning framework for small and large materials datasets. npj Comput Mater. 2021;7(1):173.

Ferrari BS, Manica M, Giro R, Laino T, Steiner MB. Predicting polymerization reactions via transfer learning using chemical language models. npj Comput Mater. 2024;10(1):119.
https://doi.org/10.1038/s41524-024-01304-8

Himanen L, Jäger MOJ, Morooka EV, Federici Canova F, Ranawat YS, Gao DZ, et al. DScribe: Library of descriptors for machine learning in materials science. Comput Phys Commun. 2020;247:106949.
https://doi.org/10.1016/j.cpc.2019.106949

Choudhary K, DeCost B, Chen C, Jain A, Tavazza F, Cohn R, et al. Recent advances and applications of deep learning methods in materials science. npj Comput Mater. 2022;8(1):59.
https://doi.org/10.1038/s41524-022-00734-6

Tshitoyan V, Dagdelen J, Weston L, Dunn A, Rong Z, Kononova O, et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature. 2019;571(7763):95-8.
https://doi.org/10.1038/s41586-019-1335-8

Cubuk ED, Sendek AD, Reed EJ. Screening billions of candidates for solid lithium-ion conductors: A transfer learning approach for small data. arXiv preprint arXiv:1810.09216. 2018.

Kong S, Guevarra D, Gomes CP, Gregoire JM. Materials representation and transfer learning for multi-property prediction. Appl Phys Rev. 2021;8(2):021409.

Chen C, Zuo Y, Ye W, Li X, Ong SP. Learning properties of ordered and disordered materials from multi-fidelity data. Nat Comput Sci. 2021;1(1):46-53.

Gupta V, Choudhary K, Tavazza F, Campbell C, Liao WK, Choudhary A, et al. Cross-property deep transfer learning framework for enhanced predictive analytics on small materials data. Nat Commun. 2021;12(1):6595.

Buterez D, Janet JP, Kiddle SJ, Oglic D, Liò P. Transfer learning with graph neural networks for improved molecular property prediction in the multi-fidelity setting. Nat Commun. 2024;15(1):1517.

Jha D, Choudhary K, Tavazza F, Liao WK, Choudhary A, Campbell C, et al. Enhancing materials property prediction by leveraging computational and experimental data using deep transfer learning. Nat Commun. 2019;10(1):5316.

Wei J, Chu X, Sun XY, Xu K, Deng HX, Chen J, et al. Machine learning in materials science. InfoMat. 2019;1(3):338-58.
https://doi.org/10.1002/inf2.12028

Hoffmann N, Schmidt J, Botti S, Marques MAL. Transfer learning on large datasets for the accurate prediction of material properties. Digit Discov. 2023;2(5):1423-35.
https://doi.org/10.1039/D3DD00030C

Siddiqui A, Hine ND. Machine-learned interatomic potentials for transition metal dichalcogenide Mo1−xWxS2−2ySe2y alloys. npj Comput Mater. 2024;10(1):169.

Chen A, Wang Z, Vidaurre KLL, Han Y, Ye S. Knowledge-reused transfer learning for molecular and materials science. J Energy Chem. 2024;98:149-62.

Gupta V, Choudhary K, Mao Y, Wang K, Tavazza F, Campbell C, et al. Structure-aware graph neural network based deep transfer learning framework for enhanced predictive analytics on diverse materials datasets. npj Comput Mater. 2024;10(1):1.

Author information

Daniel Fischer, Laura Meier, Thomas Braun & Stefan Koch contributed to this work.

Authors and affiliations

Department of Materials Simulation and Data Engineering, Faculty of Engineering, University of Freiburg, Freiburg, Germany
Daniel Fischer & Laura Meier

Department of Computational Materials Systems, Faculty of Engineering, Karlsruhe Institute of Technology, Karlsruhe, Germany
Thomas Braun & Stefan Koch

Corresponding author

Correspondence to Daniel Fischer

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Vancouver

Fischer D, Meier L, Braun T, Koch S. Transfer Learning in Computational Materials Engineering: Techniques and Case Studies. J. Comput. Data-Driven Mater. Eng. 2024;3:121.

APA

Fischer, D., Meier, L., Braun, T., & Koch, S. (2024). Transfer Learning in Computational Materials Engineering: Techniques and Case Studies. Journal of Computational and Data-Driven Materials Engineering, 3, 121.

Received

28 March 2024

Revised

16 May 2024

Accepted

03 July 2024

Published

18 September 2024

Version of record

18 September 2024

Keywords

Materials informatics; Transfer learning; Active learning; Graph neural networks; Multi-fidelity learning; Autonomous laboratories
