MDF streamlines and automates data sharing, discovery, access, and analysis by: 1) enabling data publication, regardless of data size, type, and location; 2) automating metadata extraction from submitted data into MDF metadata records (i.e., JSON-formatted documents following the MDF schema) using open-source, materials-aware extraction and ingest pipelines; and 3) unifying search across many materials data sources, including both MDF and other repositories with potentially different vocabularies and schemas. Currently, MDF stores 60 TB of data from simulation and experiment and indexes hundreds of datasets contained in external repositories. Millions of individual MDF metadata records have been created from these datasets to aid fine-grained discovery.
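To make the idea of unified search across differing vocabularies concrete, the following is a minimal sketch rather than MDF's actual implementation. It assumes two hypothetical sources (`repo_a`, `repo_b`) whose native field names are mapped onto a shared vocabulary before being searched together. The field names, the `FIELD_MAPS` table, and the helper functions are all illustrative assumptions and do not reflect the real MDF schema or API.

```python
import json

# Hypothetical mappings from each source's native field names to a shared
# vocabulary. These names are illustrative only, NOT the actual MDF schema.
FIELD_MAPS = {
    "repo_a": {"formula": "composition", "name": "title"},
    "repo_b": {"chemical_formula": "composition", "dataset_title": "title"},
}


def to_common_record(source, raw):
    """Translate a source-specific record into the shared vocabulary."""
    mapping = FIELD_MAPS[source]
    record = {
        common: raw[native]
        for native, common in mapping.items()
        if native in raw
    }
    record["source"] = source  # keep provenance alongside the mapped fields
    return record


def search(records, **criteria):
    """Return records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


# Made-up example records from the two hypothetical sources.
raw_records = [
    ("repo_a", {"formula": "Al2O3", "name": "Alumina DFT run"}),
    ("repo_b", {"chemical_formula": "Al2O3", "dataset_title": "Alumina XRD scan"}),
    ("repo_b", {"chemical_formula": "SiO2", "dataset_title": "Silica XRD scan"}),
]

index = [to_common_record(source, raw) for source, raw in raw_records]
hits = search(index, composition="Al2O3")
print(json.dumps(hits, indent=2))
```

Because both sources are normalized to the same `composition` field, a single query matches records from either one, even though the sources name that field differently.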