Institute for Advanced Materials Research Press

Search results:
Algorithmic Confidence as a Control Signal in Materials Research
Materials research increasingly relies on machine learning to accelerate property prediction and discovery, yet the trustworthiness of these models remains constrained by their inability to express epistemic limitations. Algorithmic confidence—embodied in principled uncertainty quantification—provides a quantitative measure of model reliability that can extend beyond diagnostic assessment to serve as an active control signal within the research process. This conceptual manuscript synthesizes recent developments in uncertainty-aware machine learning, Bayesian approaches, and adaptive sampling strategies to argue that confidence estimates hold untapped potential as dynamic regulators of investigative workflows. Rather than treating uncertainty solely as a performance metric or sampling criterion, we conceptualize it as a central control variable that modulates decision pathways, balances exploration and exploitation, and informs the transition from computational prediction to empirical validation. A novel framework is proposed wherein algorithmic confidence governs iterative cycles in materials inquiry, enabling self-regulating mechanisms that align model assertions with epistemic boundaries. This perspective reframes uncertainty not as a limitation but as a strategic operator capable of guiding resource-efficient, robust materials exploration in a purely conceptual sense. By elevating confidence to a control role, the approach seeks to foster more deliberate and principled integration of computational intelligence into materials science paradigms.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2023 | Article: 21
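One way to picture confidence acting as a control signal is a small routing rule that decides, per candidate material, whether a prediction is trusted or sent for empirical validation. The sketch below is illustrative only and is not taken from the article: the function name `route_candidate`, the use of ensemble spread as the confidence estimate, and the threshold `max_std` are all assumptions.

```python
from statistics import mean, stdev

def route_candidate(ensemble_predictions, max_std=0.1):
    """Route one candidate based on the spread of an ensemble's predictions.

    Hypothetical rule: a small spread means the models agree, so the
    prediction is used directly; a large spread sends the candidate
    to experimental validation instead.
    """
    mu = mean(ensemble_predictions)
    sigma = stdev(ensemble_predictions)  # sample standard deviation
    action = "trust_prediction" if sigma <= max_std else "validate_experimentally"
    return mu, sigma, action

# Made-up ensemble outputs for two hypothetical candidates.
confident = route_candidate([1.00, 1.02, 0.98])
uncertain = route_candidate([0.5, 1.0, 1.5])
print(confident[2], uncertain[2])
```

The threshold would in practice be tied to the cost of an experiment and to how well the uncertainty estimates are calibrated; here it is an arbitrary placeholder.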

Generative Models for Materials Science — Conceptual Capabilities and Scientific Limits: A Review Study
Generative models have emerged as transformative tools in materials science, enabling the inverse design of novel materials with tailored properties by learning from vast datasets of structures and compositions. This review synthesizes recent advancements in generative approaches, including variational autoencoders, generative adversarial networks, diffusion models, and large language models. It highlights their conceptual capabilities for accelerating discovery while addressing scientific limits such as data scarcity, synthesizability, and interpretability. By examining applications in inorganic crystals, organic molecules, and energy materials, we delineate how these models bridge computational efficiency with experimental validation, yet face challenges in generalizability and physical fidelity. Future directions emphasize hybrid physics-informed architectures and closed-loop automation to overcome current barriers and unlock sustainable materials innovation.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 July 2023 | Article: 26

Scientific Decision-Making with Materials AI — How Models Actually Influence Action: A Review Study
The integration of artificial intelligence (AI) and machine learning (ML) in materials science has revolutionized traditional approaches to material discovery, design, and application. This narrative review explores how AI models not only predict material properties but also influence scientific decision-making by providing actionable insights, optimizing experimental strategies, and enabling inverse design paradigms. Drawing on recent advancements, we examine the transition from data-driven prediction to AI-assisted decision-making, highlighting case studies in porous materials, optoelectronics, and polymeric membranes. The review addresses challenges such as data scarcity, model interpretability, and integration with experimental workflows, while proposing future directions for AI to enhance human decision-making in materials research. Ultimately, AI is positioned as a collaborative tool that augments scientific intuition, accelerating innovation in sustainable and high-performance materials.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 July 2023 | Article: 28

Active Learning-Driven Bayesian Optimization of Catalytic Nanoparticles for CO₂ Reduction
The escalating global challenge of carbon dioxide (CO₂) emissions necessitates innovative approaches to mitigate climate change through efficient catalytic conversion. This conceptual manuscript proposes a novel theoretical framework that integrates active learning with Bayesian optimization to enhance the design of catalytic nanoparticles for CO₂ reduction. Drawing on principles from machine learning and materials science, the framework addresses the complexities of high-dimensional parameter spaces in nanoparticle synthesis, such as size, shape, composition, and surface facets, which influence catalytic performance. By leveraging active learning to intelligently select informative data points and Bayesian optimization to refine surrogate models iteratively, the approach theoretically accelerates the identification of optimal nanoparticle configurations without empirical validation. The framework emphasizes uncertainty quantification and adaptive sampling to efficiently navigate the vast design space. This synthesis of concepts from recent literature highlights gaps in traditional optimization methods and posits that the proposed integration could conceptually reduce exploration costs while enhancing selectivity and activity in CO₂ reduction processes. The manuscript outlines theoretical underpinnings, a proposed framework, and implications for applied artificial intelligence in materials science, fostering future conceptual advancements in sustainable catalysis.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 July 2023 | Article: 34

Recent Advances in Machine Learning-Accelerated Materials Discovery — From Descriptors to Autonomous Experiments
Machine learning (ML) has become a central driver of modern materials discovery, fundamentally reshaping how materials are designed, screened, and experimentally realized. This review examines recent advances in ML-accelerated materials discovery and emphasizes the ongoing progress in material representation and descriptor development toward fully autonomous experimental platforms. We discuss how increasingly sophisticated descriptors—ranging from composition-based features and structure-aware representations to ab initio–derived and learned embeddings—have improved predictive accuracy, data efficiency, and physical interpretability across diverse materials systems. Based on these findings, we discuss the evolution of ML frameworks for property prediction, classification, and inverse design, with particular attention to uncertainty-aware modeling, multiobjective optimization, and explainable learning strategies that bridge predictive performance with scientific insight. The study also highlights the growing role of active learning and generative models in efficiently navigating vast chemical and structural spaces, enabling data-efficient exploration and hypothesis-driven discovery. At the frontier of these developments, autonomous experimental systems integrate ML with robotics to form closed-loop workflows that iteratively design, execute, and refine experiments with minimal human intervention. Applications spanning perovskites, alloys, energy materials, and nanostructures illustrate the broad impact of these approaches in overcoming traditional trial-and-error limitations. Finally, we discuss persistent challenges associated with data scarcity, extrapolation, interpretability, and system integration, and outline future directions toward more robust, scalable, and sustainable autonomous materials discovery. Collectively, these advances represent a paradigm shift from passive data-driven prediction to intelligent, self-guided materials innovation.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2024 | Article: 41

Interpreting Materials Data with Artificial Intelligence: From Prediction to Scientific Understanding
The integration of artificial intelligence (AI) and machine learning (ML) into materials science has fundamentally transformed how material properties are predicted, analyzed, and understood. While early data-driven approaches emphasized predictive accuracy and high-throughput screening, recent advances increasingly focus on interpretability and explainability, enabling AI models to contribute to mechanistic scientific insight rather than functioning as opaque black boxes. This review examines the evolution of interpretable AI in materials science and highlights the transition from property prediction to explanation-driven understanding of structure–property relationships. We investigate the progress in machine learning frameworks that operate with limited or implicit structural information, alongside the growing use of explainable AI (XAI) techniques to uncover physically meaningful descriptors, atomic-scale interactions, and microstructural drivers of material behavior. Methods such as graph-based learning, attention mechanisms, feature attribution, and uncertainty-aware modeling are discussed for their ability to improve model reliability, expose data bias, and guide hypothesis generation. Representative applications across alloys, perovskites, organic semiconductors, and ferroelectric materials demonstrate how interpretable models have revealed governing mechanisms spanning atomic, mesoscopic, and macroscopic length scales. Beyond individual case studies, this review examines persistent challenges in interpretable materials AI, including data quality, generalizability, explanation stability, and computational overhead. We argue that interpretability is not merely an auxiliary feature but a prerequisite for trustworthy and scientifically useful AI in materials research.
By synthesizing recent methodological and application-driven advances, this review positions interpretable AI as a critical enabler of mechanism-oriented discovery, experimental validation, and theory development, ultimately advancing AI from a predictive accelerator to an integral partner in scientific understanding.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2024 | Article: 42

A Conceptual Blueprint for “Digital Materials Twins” without Simulation: Definitions, Boundaries, and Use-Cases
The advent of digital twins has revolutionized various engineering domains, yet their application in materials science often relies heavily on computationally intensive simulations to replicate physical behaviors. This conceptual paper introduces “Digital Materials Twins” (DMTs) as a novel paradigm that eschews traditional simulation in favor of purely data-driven representations. DMTs leverage artificial intelligence and machine learning to create virtual counterparts of materials based solely on empirical data, enabling efficient prediction and analysis without physics-based modeling. Drawing on recent advances in data-driven materials science, we define DMTs as dynamic, data-centric models that capture material properties, structures, and responses by learning from diverse datasets. We delineate their boundaries, emphasizing limitations in real-time dynamics and in extrapolation beyond the trained data regime. By synthesizing the literature on digital twins and AI in materials, we propose a conceptual framework comprising data ingestion, feature extraction, model training, and inference. This framework enables use cases in accelerated materials design, property prediction, and optimization across sectors such as energy storage and additive manufacturing. By prioritizing conceptual innovation over empirical validation, this blueprint aims to guide future theoretical developments and foster scalable, simulation-free approaches to materials innovation. The implications for high-impact applications in applied artificial intelligence are discussed, highlighting DMTs’ potential to democratize materials research.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 July 2024 | Article: 56

Conceptual Foundations of Applied AI in Materials Science - Definitions, Assumptions, and Open Debates
The rapid integration of artificial intelligence (AI) into materials science marks a profound shift in how materials are discovered, characterized, and optimized. Rather than functioning merely as a computational aid, AI increasingly operates as an epistemic instrument that reshapes scientific workflows, decision-making practices, and notions of explanation within the field. This narrative review examines the conceptual foundations underpinning applied AI in materials science, with a particular focus on core definitions, implicit and explicit assumptions, and unresolved debates that continue to shape the domain. Key AI paradigms—including supervised, unsupervised, and reinforcement learning—are situated within materials-specific contexts such as property prediction, structure–property mapping, and autonomous experimentation. The review critically interrogates foundational assumptions regarding data quality, representativeness, generalization, and model transferability, highlighting how these assumptions condition both the successes and failures of AI-driven materials research. Persistent debates surrounding interpretability, epistemic trust, ethical responsibility, and environmental sustainability are synthesized from recently published literature. By articulating both the transformative potential and the conceptual limitations of applied AI, this review underscores the necessity of rigorous validation, transparent reasoning, and interdisciplinary collaboration to ensure that AI contributes robustly and responsibly to materials innovation.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 July 2024 | Article: 63

Uncertainty and Reliability in Materials AI — Concepts, Language, and Decision Consequences
The integration of artificial intelligence (AI) and machine learning (ML) into materials science has accelerated the discovery and design of novel materials by enabling high-throughput prediction of properties from composition, structure, and processing parameters. However, the reliability of these predictions is frequently compromised by uncertainties stemming from limited datasets, model approximations, experimental noise, and intrinsic variability in materials systems. This narrative review synthesizes recent advances in understanding uncertainty and reliability in materials AI. It covers fundamental concepts such as aleatoric and epistemic uncertainty; methods for quantification, including Bayesian neural networks, ensembles, and Gaussian processes; inconsistencies in terminology and language across the literature; and the downstream consequences for decision-making in materials engineering, design, and deployment. Emphasis is placed on calibration of uncertainty estimates, domain-of-applicability assessment, and risk-aware applications in safety-critical contexts such as structural alloys and energy materials. By highlighting best practices and gaps, the review advocates for standardized frameworks to build trust and facilitate industrial translation of materials AI. Key challenges include data scarcity in high-performance materials and the need for physics-informed UQ to mitigate overconfidence in extrapolative predictions. This synthesis underscores the importance of robust uncertainty handling for responsible AI deployment in materials innovation.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2025 | Article: 65

Causality in Materials Informatics — Conceptual Progress, Limitations, and Future Directions
Materials informatics has emerged as a central paradigm in contemporary materials science, leveraging machine learning and data-driven modeling to accelerate materials discovery, optimization, and deployment. Despite substantial advances in predictive accuracy, most existing approaches remain fundamentally correlational, limiting their reliability under distribution shifts, experimental interventions, and real-world deployment scenarios. This reliance on correlation constrains scientific interpretability and undermines the capacity of AI systems to function as genuine instruments of materials reasoning. Causality offers a principled framework for overcoming these limitations by explicitly modeling cause-and-effect relationships among composition, processing, structure, and properties. This narrative review synthesizes conceptual progress in integrating causal inference into materials informatics, examining foundational causal frameworks, advances in causal discovery, hybrid causal–machine learning approaches, and emerging applications across materials domains such as nanocatalysis, ferroelectrics, and electrochemical energy storage. We critically analyze persistent challenges—including data scarcity, assumption violations, limited external validity, and computational and epistemic constraints—that currently hinder widespread adoption. Drawing exclusively on published peer-reviewed literature, the review emphasizes thematic and epistemic developments rather than algorithmic prescriptions. We argue that causality represents a structural shift in how AI systems contribute to materials science: from correlational predictors to intervention-aware, mechanism-aligned reasoning tools. By articulating future directions centered on hybrid modeling, domain-knowledge integration, and interdisciplinary collaboration, this review positions causality as a necessary foundation for robust, generalizable, and scientifically legitimate materials informatics.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2025 | Article: 66

Physics as Constraint, Not Input: A Conceptual Reframing of Physics-Guided Machine Learning in Materials Science
Physics-guided machine learning (PGML) has emerged as a hybrid paradigm in materials science, integrating domain knowledge with data-driven methods to enhance predictive accuracy and generalizability. Conventional approaches typically embed physical principles as soft inputs—either through loss-function regularization or auxiliary features—allowing violations during optimization. This manuscript advances a conceptual reframing in which physics operates as a hard constraint on the model’s hypothesis space rather than as an additive input. By restricting permissible functional forms, symmetries, and conservation relations a priori, the framework enforces physical consistency at the architectural level, altering the interaction dynamics between data and prior knowledge. The reframing yields systems-level insights into epistemic trade-offs: reduced reliance on large datasets, improved extrapolation beyond training regimes, and inherent satisfaction of thermodynamic or mechanical invariants critical to materials behavior. Analytical implications include feedback structures that couple data refinement to constraint satisfaction, revealing emergent robustness in multiscale modeling. This perspective addresses persistent challenges in materials science, such as sparse experimental data and complex microstructure-property relationships, without resorting to empirical validation. The contribution lies in reinterpreting PGML’s epistemic foundation, steering future developments toward constraint-centric designs that prioritize physical fidelity over post-hoc penalization.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2025 | Article: 70

AI-Mediated Hypothesis Generation in Materials Science: A Conceptual Framework for Scientific Creativity
The integration of artificial intelligence (AI) into materials science represents a paradigm shift in how scientific creativity is manifested and harnessed. This conceptual paper develops a novel theoretical framework for understanding AI-mediated hypothesis generation, emphasizing its role in enhancing scientific creativity within materials discovery and design. Traditional hypothesis generation in materials science relies on human intuition, empirical observation, and theoretical deduction, often constrained by cognitive limitations and the vast complexity of material systems. AI, through machine learning algorithms and generative models, augments this process by enabling rapid pattern recognition, simulation of hypothetical scenarios, and exploration of uncharted chemical spaces. The proposed framework, termed the symbiotic creativity cycle (SCC), posits a dynamic interplay between human and AI agents, where AI serves as a cognitive amplifier, facilitating divergent exploration and convergent refinement of hypotheses. This cycle incorporates iterative feedback loops that integrate domain knowledge with data-driven insights, fostering emergent creativity that transcends individual capabilities. Key elements include AI’s ability to handle multidimensional data, predict material properties, and generate novel conceptual blends. The framework highlights potential applications for accelerating discoveries in advanced alloys, nanomaterials, and energy storage materials, while addressing challenges such as interpretability and ethical integration. By reconceptualizing scientific creativity as a hybrid human-AI endeavor, this paper lays the foundation for future theoretical developments and practical applications in applied artificial intelligence for materials science. Ultimately, AI-mediated hypothesis generation promises to democratize innovation, enabling more efficient navigation of the materials design landscape.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 July 2025 | Article: 77

Small-Data and Sparse-Regime Learning in Materials AI — Methods, Assumptions, and Limits
The integration of artificial intelligence (AI) and machine learning (ML) into materials science, often referred to as materials informatics or materials AI, has accelerated the discovery, design, and optimization of advanced materials. However, materials science frequently operates in small-data and sparse-regime conditions, where datasets are limited in size (often tens to hundreds of samples), high-dimensional, imbalanced, or sparsely populated due to the high cost, time, and complexity of experimental measurements and high-fidelity simulations. This narrative review synthesizes recent advances in methods tailored to these constraints, categorizing approaches at the data-source level (e.g., literature extraction, database construction, high-throughput workflows), algorithmic level (e.g., support vector machines, Gaussian process regression, ensemble models, imbalanced learning techniques), and strategic level (e.g., active learning, transfer learning). Key assumptions underlying these methods are examined, including similarity between source and target domains for transfer learning, representativeness of initial samples and reliable uncertainty quantification in active learning, and the validity of physical priors or inductive biases in physics-informed approaches. The review also addresses inherent limits, such as risks of overfitting, poor generalization beyond the training distribution, sensitivity to data quality and noise, challenges in uncertainty calibration, and dependence on domain expertise. By highlighting successful applications in property prediction, alloy design, and perovskite optimization, this work elucidates the current capabilities and boundaries of small-data and sparse-regime learning in materials AI, guiding researchers navigating data-limited environments.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2026 | Article: 90

Physics-Integrated Machine Learning for Materials Science — Conceptual Taxonomies and Open Questions
The integration of physical principles into machine learning (ML) frameworks has emerged as a transformative approach in materials science, addressing the limitations of purely data-driven models by incorporating domain knowledge to enhance predictive accuracy, generalizability, and interpretability. This narrative review explores the conceptual taxonomies of physics-integrated ML methods, their applications in materials discovery and design, and the associated challenges in data bias and ethical considerations. Drawing on recent peer-reviewed literature, we classify physics-integration strategies such as physics-informed neural networks (PINNs), hybrid models combining ML with physical simulations, and constraint-based learning, and highlight their roles in solving complex problems such as material property prediction, microstructure analysis, and phase stability. We also examine how data biases in training datasets can propagate errors and inequities in model outputs, and discuss the ethical values underpinning the use of AI in scientific research, including transparency, accountability, and societal impact. The review underscores the potential of these methods to accelerate innovation in materials science while emphasizing the need for rigorous validation and interdisciplinary collaboration. By synthesizing current advancements, this article aims to provide a foundational understanding for researchers and practitioners, paving the way for future developments in this interdisciplinary field.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 January 2026 | Article: 91

Conceptual Foundations for Adversarial Validation in Materials Machine Learning
Standard validation protocols in materials machine learning continue to rely on the assumption that training and test data are drawn from the same underlying distribution. This assumption is almost invariably violated in real-world materials datasets because of temporal drift in measurement techniques, compositional biases in database construction, and experimental confounders arising from different laboratories and instruments. This conceptual framework article proposes adversarial validation as a diagnostic tool specifically tailored for materials informatics: a method that trains a discriminator to explicitly detect whether a distribution shift exists between any two datasets, thereby revealing hidden generalization failures that conventional train-test splits and k-fold cross-validation cannot expose. The framework introduces the conceptual foundations of adversarial validation, distinguishes it from adversarial attacks, articulates why the technique is particularly powerful in the small-data, high-dimensional, and physically constrained domain of materials science, and offers a five-component structure for its systematic application—feature-space definition, classifier selection, shift-detection thresholding, localization of driving features, and actionable response rules. By embedding materials-specific domain knowledge into the interpretation of discriminator performance, the approach transforms validation from a passive checkpoint into an active diagnostic that can distinguish temporal shift from compositional bias and experimental confounding. 
The implications for materials AI practice are immediate and transformative: researchers can now report adversarial validation results alongside standard metrics, trigger targeted dataset augmentation or model retraining when shifts are detected, and document potential sources of distribution mismatch in experimental workflows, ultimately raising the robustness and trustworthiness of property predictions that underpin materials discovery and design.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2022 | Article: 99
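The core mechanic described in this abstract, training a discriminator to tell two datasets apart, can be sketched in a few lines. The example below is a minimal illustration, not the article's method: it assumes a plain logistic-regression discriminator, a half/half holdout split, synthetic two-feature data, and a hypothetical alert threshold `SHIFT_THRESHOLD`. A held-out AUC near 0.5 suggests the discriminator cannot separate the sets; a high AUC signals a shift, and the larger weight points at the feature driving it.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def adversarial_validation(source, target, seed=0):
    """Label source rows 0 and target rows 1, fit on half, score the other half."""
    rows = [(x, 0) for x in source] + [(x, 1) for x in target]
    random.Random(seed).shuffle(rows)
    half = len(rows) // 2
    fit_rows, eval_rows = rows[:half], rows[half:]
    w, b = fit_logistic([x for x, _ in fit_rows], [y for _, y in fit_rows])
    scores = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x, _ in eval_rows]
    return auc(scores, [y for _, y in eval_rows]), w

# Synthetic stand-in data (assumption): two features, optional shift in feature 0.
rng = random.Random(0)

def sample(n, shift=0.0):
    return [[rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]

train = sample(200)
auc_same, _ = adversarial_validation(train, sample(200))
auc_shifted, weights = adversarial_validation(train, sample(200, shift=2.0))

SHIFT_THRESHOLD = 0.7  # hypothetical alert level, not from the article
print("same distribution:", round(auc_same, 2), auc_same > SHIFT_THRESHOLD)
print("shifted feature 0:", round(auc_shifted, 2), auc_shifted > SHIFT_THRESHOLD)
print("discriminator weights:", [round(v, 2) for v in weights])
```

In a real materials dataset the features, the classifier, and the threshold would each need to be chosen with domain knowledge; the linear model here is only the simplest discriminator that makes the idea concrete.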

The Treatment of Absence and Null Results in Materials Machine Learning Literature: A Review Study
This review systematically examines the treatment of absence and null results in the materials machine learning literature spanning 2017–2022, drawing exclusively on a curated set of 30 peer-reviewed publications and foundational works that address publication bias, negative findings, and reproducibility challenges in data-driven materials discovery. Through a targeted search strategy across databases such as Web of Science, Scopus, and arXiv using terms including “null result,” “negative result,” “publication bias,” “file drawer,” “failed synthesis,” and “reproducibility” combined with materials informatics keywords, the analysis reveals a persistent imbalance: while successful predictions and syntheses dominate published outputs, systematic documentation of failed predictions, unsuccessful syntheses, null correlations, and abandoned model architectures remains exceedingly rare. What is currently reported tends to be limited to negative outcomes that coincidentally reveal mechanistic insights or contradict high-profile hypotheses, whereas what is systematically unreported encompasses the vast majority of unsuccessful hyperparameter searches, negative active learning campaigns, and non-discoveries that yield no novel materials meeting target criteria. The typology of absence and null results developed here identifies six distinct categories—negative predictive outcomes, null hypothesis non-rejection, failed synthesis, non-discovery, failed replication, and abandoned architecture—each carrying unique implications for scientific progress. The consequences of this non-reporting include severe overestimation of model performance, widespread redundant experimental effort, a false sense of methodological consensus across the field, and slowed overall discovery rates as potentially informative negative signals remain invisible. 
Ultimately, this review offers concrete recommendations for authors, journals, and the broader community to shift incentives toward transparent reporting of absence, thereby restoring balance to the materials AI literature and accelerating reliable data-driven discovery.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 July 2022 | Article: 105

A Conceptual Distinction between Generalization and Transfer in Materials Machine Learning
In the rapidly expanding domain of artificial intelligence applied to materials science, a persistent conceptual ambiguity undermines the reliability of reported model capabilities. The terms “generalization” and “transfer” are routinely conflated, with authors claiming that a model “generalizes” when it is in fact being evaluated on samples drawn from a distinctly different distribution. This definitional paper draws a sharp conceptual distinction between the two notions. Generalization is defined as the expected performance of a trained model on new samples drawn independently and identically from the same underlying distribution as the training data. In contrast, transfer is defined as performance on samples drawn from a different distribution, where the i.i.d. assumption is violated by construction. The distinction matters because a model that generalizes excellently within its training distribution can fail dramatically under transfer conditions, and conversely, a successful transfer mechanism may mask poor generalization; treating the two interchangeably therefore produces overclaims about model robustness that cannot be sustained when materials discovery moves beyond the convex hull of available training data. The paper articulates a two-dimensional boundary framework—distribution-shift magnitude and feature-space overlap—that locates any given evaluation setting along a continuum from pure generalization to pure transfer, thereby enabling authors, reviewers, and practitioners to specify precisely which capability is being claimed and tested. By clarifying these boundaries and exposing the epistemic costs of current usage, the work supplies a conceptual foundation for more disciplined reporting standards and evaluation protocols in materials machine learning.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2023 | Article: 108

The Handling of Domain Shift in Materials Machine Learning Literature: A Review Study
This review systematically examines the handling, or more often the neglect, of domain shift within the materials machine learning literature published between 2017 and 2023. It draws on a targeted search of peer-reviewed publications across specialized databases and journals to compile and analyze exactly 30 representative studies that span foundational overviews, application-focused works, and methodological explorations. Domain shift in materials science takes four distinct yet interrelated forms (temporal, compositional, experimental, and theoretical), each arising from the inherently heterogeneous nature of materials data sources that range from evolving laboratory protocols and diverse chemical families to inter-laboratory variations and discrepancies between computational approximations and experimental realities. Current practices reveal that explicit acknowledgment of domain shift remains rare, with the majority of papers proceeding under the default assumption of identical training and test distributions. At the same time, detection methods and adaptation strategies appear in fewer than one in five studies, leaving models vulnerable to silent degradation when deployed on real-world materials problems. The surveyed methods for handling domain shift include statistical detection techniques, domain-adversarial training frameworks, feature-alignment approaches, and shift-robust evaluation protocols, many of which have been proposed in adjacent machine-learning fields yet remain underutilized in materials contexts despite their direct relevance to property prediction and inverse design tasks.
Collectively, these findings underscore the urgent need for standardized shift-reporting protocols, the development of materials-specific out-of-distribution benchmarks, and the integration of domain-adaptation pipelines into routine workflows, thereby elevating the reliability, generalizability, and practical utility of machine-learning models in accelerating materials discovery.
Journal of Artificial Intelligence for Materials Science
Review | Open access | 18 July 2023 | Article: 117

Algorithmic Path Dependency as Scientific Path Dependency: A Conceptual Link
In the rapidly evolving domain of artificial intelligence for materials science, path dependency remains a critically overlooked phenomenon that shapes both computational pipelines and the broader scientific enterprise. Algorithmic path dependency manifests when seemingly innocuous early choices in neural network initialization, training data ordering, hyperparameter selection, feature descriptor definition, or early stopping criteria create irreversible constraints on subsequent model behaviors and outputs, as evidenced in recurrent neural network architectures designed for heterogeneous materials. Scientific path dependency, by contrast, arises in the history and philosophy of science when initial decisions regarding research problems, material systems, theoretical frameworks, experimental protocols, or funding priorities lock research communities into particular trajectories, rendering alternative avenues increasingly difficult to pursue even when they might yield superior insights. This paper advances the theoretical claim that algorithmic path dependency propagates directly into scientific path dependency within materials AI, such that technical decisions made at the level of code and data become de facto determinants of which materials are discovered, which questions are asked, and which knowledge ultimately enters the scientific canon. The linkage operates through identifiable mechanisms, including output filtering, resource allocation, knowledge representation, and publication bias, each amplifying the long-term scientific consequences of early algorithmic commitments. By drawing upon foundational economic concepts of increasing returns and historical contingency alongside contemporary literature in machine learning for materials, this theoretical analysis proposes that materials AI researchers must explicitly recognize these dynamics to avoid unintended lock-in effects that could limit the diversity and robustness of future discoveries. 
The analysis further derives corollaries concerning constrained output diversity, the practical irreversibility of certain scientific paths, and the necessity of methodological pluralism, offering concrete implications for research practice, peer review standards, and community norms. Ultimately, this conceptual linkage reframes early algorithmic decisions not as mere technical details but as foundational scientific commitments whose consequences reverberate through the entire materials discovery ecosystem.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2024 | Article: 121

Property Prediction vs Mechanistic Insight: A Conceptual Divide in Materials AI
In computational materials engineering, the integration of artificial intelligence (AI) has transformed discovery pipelines from labor-intensive simulations to data-driven infrastructures capable of navigating vast chemical spaces. High-throughput computations and machine learning architectures, such as graph neural networks, have enabled rapid property prediction, accelerating the screening of candidates for applications ranging from energy storage to structural alloys. Yet, this paradigm emphasizes forward modeling—mapping inputs to outputs—often at the expense of mechanistic insight, which requires disentangling causal interactions within atomic-scale dynamics. The conceptual divide between property prediction and mechanistic insight manifests in epistemic tensions: predictive models excel in interpolation but falter in extrapolation, while insight-oriented approaches demand representations that encode not just structural motifs but relational hierarchies across scales. This manuscript introduces the Interpretive Cascade Framework, a systems-level conceptualization that reframes materials AI as a layered cascade of representation, inference, and steering logics. By integrating multimodal data streams with feedback-mediated discovery workflows, the framework elucidates how computational infrastructures can balance predictive efficiency with interpretive depth, mitigating risks of epistemic opacity in closed-loop experimentation. Structural layers delineate data ingestion to hypothesis refinement, incorporating uncertainty propagation as a steering mechanism rather than a mere byproduct. Implications for the field lie in reorienting AI ecosystems toward hybrid discovery logics, where representation learning informs inverse design without sacrificing traceability. 
This interpretive lens fosters resilient infrastructures, enabling materials science to evolve beyond black-box predictions toward epistemically robust computational paradigms that sustain long-term innovation in data-driven materials engineering.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 September 2022 | Article: 90

Algorithmic Novelty vs Chemical Novelty: Rethinking Innovation Metrics
In the evolving landscape of computational and data-driven materials engineering, innovation is increasingly driven by the interplay between algorithmic advancements and chemical discoveries. Traditional metrics often conflate these dimensions, overlooking how machine learning architectures, such as graph neural networks and representation learning, enable high-throughput computation while potentially prioritizing computational efficiency over substantive material breakthroughs. This conceptual gap hinders a nuanced understanding of progress in materials informatics, where autonomous discovery systems and closed-loop experimentation integrate simulation-experiment coupling with uncertainty quantification. Here, we introduce the Algorithmic-Chemical Novelty Duality Framework (ACNDF), a novel interpretive structure that disentangles algorithmic novelty—encompassing innovations in deep learning architectures and multimodal datasets—from chemical novelty, focused on inverse design and emergent material properties. By emphasizing systems-level insights into representation-inference interactions and epistemic risk structures, ACNDF reorients innovation metrics toward balanced discovery steering logics. This framework highlights infrastructure trade-offs in foundation models for science, fostering more integrative workflows. Implications extend to enhancing predictive analytics and transfer learning across small data regimes, ultimately guiding computational ecosystems toward sustainable innovation in materials engineering.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 March 2023 | Article: 94

Compositional Space Is Not Uniform: Density Gradients in Data-Driven Screening
In the evolving landscape of computational and data-driven materials engineering, the exploration of compositional spaces has become central to accelerating materials discovery. Traditional approaches often assume uniformity in these spaces, treating them as isotropic domains where data points are evenly distributed and equally informative. However, real-world datasets exhibit inherent density gradients, where regions of high data concentration contrast with sparse zones, influencing the reliability of machine learning predictions and high-throughput screening outcomes. This non-uniformity arises from biases in experimental sourcing, computational feasibility constraints, and intrinsic material stability landscapes, leading to epistemic risks in inverse design and autonomous discovery pipelines. To address this conceptual gap, we introduce the Density-Gradient Adaptive Screening (DGAS) Framework, a novel interpretive structure that integrates gradient-aware representation learning with adaptive sampling logics to navigate these heterogeneous spaces. The framework conceptualizes compositional domains as multi-layered manifolds with varying informational densities, incorporating feedback mechanisms between data ingestion, model inference, and discovery steering. By formalizing density gradients as dynamic modulators of uncertainty propagation, DGAS offers systems-level insights into optimizing closed-loop experimentation and multimodal dataset curation. Implications extend to foundation models in materials science, enhancing simulation-experiment coupling and reducing extrapolation errors in underrepresented compositional regimes. This work underscores the need for gradient-centric paradigms in materials informatics, fostering more robust and efficient pathways toward next-generation materials.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 March 2023 | Article: 96

Multimodal, Physics-Informed Machine Learning for Accelerated Materials Design and Discovery
In the evolving landscape of computational materials engineering, the integration of multimodal data sources with physics-informed machine learning paradigms promises to revolutionize the pace and precision of materials design and discovery. This conceptual manuscript explores the synergies between diverse data modalities—ranging from experimental spectra to simulation-derived properties—and machine learning models constrained by physical laws, aiming to address persistent challenges in data scarcity, model generalizability, and discovery efficiency within materials science. By synthesizing recent advancements in representation learning, graph neural networks, and autonomous systems, we identify a conceptual gap in holistic frameworks that unify multimodal inputs with physics-based priors for accelerated inverse design. We introduce a novel conceptual framework, termed the Multimodal Physics-Constrained Discovery Engine (MPCDE), which structures data-model-discovery pipelines through layered interactions, feedback mechanisms, and epistemic steering logics. This framework emphasizes computational workflows that balance representation fidelity with inference robustness, incorporating uncertainty quantification to mitigate risks in high-throughput settings. Implications for the field include enhanced coupling of simulation and experimentation, improved scalability of foundation models, and streamlined closed-loop discovery systems. Ultimately, this work posits interpretive insights into how such integrated approaches can transform materials informatics into a more predictive and autonomous discipline, fostering innovations in energy, electronics, and structural materials.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 March 2023 | Article: 100

Optimization without Causality: Limits of Correlation-Driven Materials Design
In the evolving landscape of computational and data-driven materials engineering, machine learning techniques have revolutionized the discovery and optimization of materials by leveraging vast datasets to identify patterns and correlations. However, this reliance on correlation-driven approaches often overlooks the underlying causal mechanisms that govern material properties and behaviors, leading to inherent limitations in the generalizability and robustness of designed materials. This manuscript explores the conceptual boundaries of optimization strategies that prioritize statistical associations over causal understanding within materials informatics ecosystems. We introduce a novel conceptual framework, termed the Correlation Boundary Architecture (CBA), which delineates the epistemic constraints imposed by correlation-centric pipelines in materials design. The CBA integrates representation learning, inference dynamics, and feedback structures to highlight how data-driven optimizations can falter in extrapolative scenarios, such as novel chemical spaces or extreme conditions. By synthesizing recent advancements in graph neural networks, high-throughput computations, and uncertainty quantification, we articulate the trade-offs between computational efficiency and causal fidelity. Implications extend to autonomous discovery systems and inverse design paradigms, suggesting pathways for hybrid frameworks that mitigate correlation biases through enhanced interpretive layers. This work underscores the need for computational steering logics that balance correlative power with causal awareness, fostering more resilient materials engineering practices.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 September 2023 | Article: 101

Simulation Priors in Machine Learning Materials Models: Hidden Physics Assumptions
The integration of machine learning into materials engineering has transformed discovery pipelines by leveraging vast simulation-generated datasets and high-throughput computational workflows. Within this data-driven paradigm, models frequently incorporate simulation priors—implicit assumptions derived from physical approximations, boundary conditions, and discretization choices embedded in first-principles calculations or molecular dynamics trajectories. These priors, often hidden within representation learning and graph-based architectures, introduce epistemic biases that propagate through inference to downstream tasks such as inverse design and closed-loop experimentation. A key conceptual gap lies in the lack of systematic frameworks for articulating and managing these assumptions as integral components of the computational infrastructure rather than incidental data artifacts. This article introduces the Simulation Prior Articulation Framework (SPAF), an original systems-level conceptual structure that delineates layered processing of multimodal materials data, explicit prior extraction from simulation ecosystems, integration into deep learning architectures, and steering of discovery pipelines via feedback mechanisms. SPAF emphasizes representation–inference interactions, computational workflow dynamics, and infrastructure trade-offs to enhance simulation–experiment coupling without empirical benchmarking. By framing hidden physics assumptions as addressable epistemic structures, the framework provides integrative insights for materials informatics, foundation models, and autonomous discovery systems, supporting more transparent and robust data-driven materials engineering pipelines.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 September 2023 | Article: 103

Computational and Data-Driven Materials Engineering: High-Throughput Computational Screening Platforms, Workflows, and Discovery Outcomes
The field of computational and data-driven materials engineering has undergone rapid evolution, driven by advancements in high-throughput computational screening, machine learning algorithms, and integrated workflows that accelerate materials discovery. This review synthesizes recent developments in materials informatics, focusing on platforms that enable efficient exploration of vast chemical spaces through automated computations and data analytics. Key areas include the application of graph neural networks and representation learning for property prediction, active learning strategies to optimize experimental feedback loops, and the integration of multimodal datasets for enhanced model accuracy. High-throughput methods have facilitated discoveries in diverse domains, such as superconductors, battery materials, and high-entropy alloys, by combining density functional theory simulations with machine learning surrogates. Autonomous laboratories and closed-loop systems represent a paradigm shift, allowing self-driving experiments that minimize human intervention while maximizing discovery efficiency. Uncertainty quantification plays a critical role in guiding these processes, ensuring reliable predictions amid sparse data. This narrative review structures the landscape into computational ecosystems, workflow integrations, and discovery outcomes, highlighting cross-study synergies. It positions the field at the cusp of scalable, inverse design paradigms, where data-driven insights bridge simulation and experimentation to address grand challenges in materials science.
Journal of Computational and Data-Driven Materials Engineering
Review | Open access | 18 September 2023 | Article: 105

Computational and Data-Driven Materials Engineering: Multimodal Materials Datasets, Integration Frameworks, and Discovery Potential
The field of computational and data-driven materials engineering has transformed from traditional high-throughput simulations to sophisticated ecosystems integrating machine learning with multimodal datasets for accelerated discovery. This review synthesizes recent advancements in materials informatics, emphasizing the role of graph neural networks and deep learning in processing complex structural and property data. We examine multimodal datasets that combine experimental, computational, and textual modalities, enabling robust representation learning and uncertainty quantification. Integration frameworks are discussed, including active learning loops and multi-fidelity models that bridge simulation and experiment, addressing challenges like data sparsity and distribution shifts. The discovery potential is highlighted through applications in property prediction, inverse design, and autonomous systems, such as identifying stable alloys and energy materials. By providing an original synthesis of these elements, this article underscores the shift toward closed-loop workflows that enhance generalizability and interpretability, while identifying gaps in handling finite-temperature stability and disordered systems. Ultimately, these approaches promise to expand the known materials space by orders of magnitude, fostering innovations in sustainable technologies.
Journal of Computational and Data-Driven Materials Engineering
Review | Open access | 18 September 2023 | Article: 106

Data-Driven Materials Engineering: Inverse Design Strategies, Machine Learning Architectures, and Application Domains
The advent of data-driven approaches has revolutionized materials engineering, enabling inverse design strategies that prioritize target properties to guide material synthesis and optimization. This review synthesizes recent advancements in machine learning architectures tailored for materials informatics, including graph neural networks and representation learning frameworks that capture atomic-scale interactions and multiscale phenomena. We examine the integration of high-throughput computations with experimental workflows, highlighting closed-loop systems that incorporate active learning and uncertainty quantification to accelerate discovery. Key application domains span energy materials, metamaterials, and catalytic systems, where multimodal datasets facilitate simulation-experiment synergies. By analyzing computational ecosystems, we underscore the shift from forward modeling to inverse paradigms, emphasizing autonomous laboratories that iteratively refine hypotheses through data feedback loops. Challenges in generalizability and data scarcity are contextualized within broader systems integration, offering a cohesive perspective on how these tools reshape materials design. This narrative integrates cross-study insights to propose unified frameworks for scalable, data-centric engineering, bridging theoretical models with practical implementations in computational materials science.
Journal of Computational and Data-Driven Materials Engineering
Review | Open access | 18 September 2023 | Article: 107

Algorithmic Consensus vs Scientific Consensus in Materials Prediction
In the rapidly evolving field of computational and data-driven materials engineering, the interplay between algorithmic processes and established scientific paradigms shapes the reliability of predictive outcomes. Traditional scientific consensus emerges from iterative experimental validation, peer review, and cumulative evidence, fostering a shared understanding of material behaviors and properties. In contrast, algorithmic consensus arises from the aggregation of computational models, often leveraging machine learning architectures to distill patterns from vast datasets. This manuscript explores the tensions and synergies between these two forms of consensus in materials prediction, highlighting how data-driven approaches can either reinforce or challenge longstanding scientific interpretations. A conceptual gap persists in integrating these consensus mechanisms, where algorithmic outputs may diverge from empirical benchmarks due to representation biases or uncertainty propagation. To address this, we introduce the Consensus Integration Lattice (CIL), a novel framework that structures the alignment of algorithmic and scientific consensus through layered computational workflows, feedback mechanisms, and epistemic risk assessments. By conceptualizing discovery pipelines that couple high-throughput simulations with multimodal data integration, CIL facilitates more robust materials predictions. Implications extend to autonomous discovery systems, inverse design strategies, and uncertainty quantification, potentially enhancing the efficiency of materials informatics ecosystems. This work underscores the need for infrastructure-level analyses to bridge computational agility with scientific rigor, paving the way for hybrid paradigms in materials engineering.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 March 2024 | Article: 108

Closed-World Training in an Open Materials Universe
In the rapidly evolving field of computational and data-driven materials engineering, machine learning models are increasingly trained on curated datasets that represent a closed-world approximation of material properties and behaviors. However, the broader materials universe encompasses vast, unexplored compositional spaces, dynamic environmental interactions, and emergent phenomena that defy static boundaries. This conceptual manuscript addresses the inherent tension between closed-world training paradigms—characterized by finite, labeled data regimes—and the open, infinite nature of materials discovery. We introduce a novel conceptual framework, termed the Adaptive Boundary Inference Architecture (ABIA), which integrates representation learning, uncertainty-aware feedback mechanisms, and multi-scale inference logics to navigate this disparity. ABIA conceptualizes training as a dynamic process where model boundaries adapt through iterative interactions between data representations and discovery pipelines, fostering resilience to out-of-distribution materials. By synthesizing recent advances in graph neural networks, foundation models, and autonomous systems, the framework highlights computational steering strategies that balance exploitation of known data with exploration of open spaces. Implications extend to enhanced inverse design, multimodal integration, and epistemic risk management in materials informatics, ultimately advancing sustainable and efficient materials engineering workflows. This work underscores the need for interpretive systems that transcend traditional closed-loop constraints, promoting a more holistic approach to data-driven discovery in an unbounded materials landscape.
Journal of Computational and Data-Driven Materials Engineering
Original Research | Open access | 18 March 2024 | Article: 109