Institute for Advanced Materials Research Press

Search results:
Model Entropy and Scientific Information Loss in Compressed Representations of Materials
Compressed representations, such as handcrafted descriptors, autoencoder embeddings, and graph-neural-network latent spaces, have become indispensable in artificial-intelligence-driven materials science because they enable scalable property prediction from high-dimensional atomic configurations. Yet the very act of compression, while optimizing statistical correlation with target properties, systematically discards information whose scientific value lies outside mere predictive utility. This theoretical analysis applies information-theoretic principles from Shannon and from Cover and Thomas to examine how dimensionality reduction in materials representations affects the retention of scientifically relevant content. Drawing on the concept of model entropy introduced by S. S., the paper develops “model entropy” as a quantitative lens for assessing the information content preserved in any compressed materials representation. It articulates a core theoretical claim: compression optimized for predictive accuracy maximizes statistical information but can erode scientific information, meaning the mechanistic, causal, and counterfactual structures essential for understanding, explanation, and extrapolation. A typology of five distinct information-loss mechanisms is developed, each illustrated with representative materials-science scenarios. The analysis culminates in concrete implications for representation design and scientific inference, arguing that future materials AI must move beyond accuracy-centric evaluation toward explicit auditing and preservation of scientific information. By distinguishing statistical signal from epistemic content, this work offers a conceptual framework for building representations that serve both prediction and discovery without hidden epistemic costs.
Journal of Artificial Intelligence for Materials Science
Original Research | Open access | 18 January 2023 | Article: 112