The integration of machine learning into materials engineering has transformed discovery pipelines by leveraging large simulation-generated datasets and high-throughput computational workflows. Within this data-driven paradigm, models frequently incorporate simulation priors: implicit assumptions that stem from the physical approximations, boundary conditions, and discretization choices embedded in first-principles calculations or molecular dynamics trajectories. These priors are often hidden within learned representations and graph-based architectures, where they introduce epistemic biases that propagate through inference to downstream tasks such as inverse design and closed-loop experimentation. What is missing is a systematic framework for articulating and managing these assumptions as integral components of the computational infrastructure rather than as incidental data artifacts. This article introduces the Simulation Prior Articulation Framework (SPAF), an original systems-level conceptual structure comprising four elements: layered processing of multimodal materials data, explicit extraction of priors from simulation ecosystems, integration of those priors into deep learning architectures, and steering of discovery pipelines through feedback mechanisms. SPAF emphasizes representation–inference interactions, computational workflow dynamics, and infrastructure trade-offs, with the aim of strengthening simulation–experiment coupling. The treatment is conceptual and does not include empirical benchmarking. By framing hidden physics assumptions as addressable epistemic structures, the framework offers integrative insights for materials informatics, foundation models, and autonomous discovery systems, and supports more transparent and robust data-driven materials engineering pipelines.