Rapid selection of milk mid-infrared spectra for creating a world representative spectral database of dairy cow population.
equation
mid-infrared
milk
Journal
Journal of dairy science
ISSN: 1525-3198
Abbreviated title: J Dairy Sci
Country: United States
NLM ID: 2985126R
Publication information
Publication date:
19 Jul 2024
History:
received: 13 Mar 2024
accepted: 27 Jun 2024
medline: 22 Jul 2024
pubmed: 22 Jul 2024
entrez: 21 Jul 2024
Status:
aheadofprint
Abstract
The advantage of employing mid-infrared spectrometry for milk analysis in breeding lies in its ability to quickly generate millions of records. However, these records may be biased if the calibration process does not account for their spectral variability when constructing the predictive model. This study therefore introduces a novel method for developing a World Representative Spectral Database (WRSD) to reduce the risk of spectral extrapolation when predicting dairy traits in new samples. Using a 2-phase selection procedure that is both efficient and light on memory, we first generate a decomposition matrix via Principal Component Analysis (PCA) on a data set of 2,324,443 records. The second phase iterates spectral selection based on a location index derived from the PCA scores, calculating the occurrence frequency of spectra to refine the barycenter estimates. The barycenter of the selected spectra closely aligns with that of the entire data set, demonstrating the efficacy of using just 3 principal components (PCs). Applying the procedure to 4 varied data sets totaling over 21 million records, we select 583,440 spectra to represent spectral diversity, with selection rates between 2.00% and 14.88%. This selection illustrates the spectral variability across different dairy populations and data providers. A hypothetical calibration set of 71 samples demonstrates the WRSD's utility for algorithm developers. This calibration set covers between 91.42% and 98.50% of the WRSD variability, except for the Irish data set (3.50%), indicating a need for additional samples to accurately represent Irish variability and minimize spectral extrapolation. This study offers valuable insights into the representativeness of training sets for capturing spectral variability within targeted dairy populations.
While the current WRSD does not fully encompass global milk spectral diversity, its development underscores the importance of gathering more data and standardizing spectral information across spectrometer brands. Ultimately, the WRSD proves crucial not just for trait prediction but also for identifying abnormal milk samples, underscoring its relevance to dairy science technology.
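The abstract does not specify how the location index is computed or how the selection iterates, so the following Python sketch is only one possible reading of the 2-phase idea, not the authors' implementation. It assumes (hypothetically) that the location index is the cell of a regular grid laid over the first 3 PC scores, that one spectrum is kept per occupied cell, and that the occurrence frequency is the number of spectra falling in each cell. All function names, the grid size `n_bins`, the chunk size, and the synthetic data are illustrative assumptions.

```python
import numpy as np


def fit_pca(X, n_components=3):
    """Phase 1: return the column means and the loading (decomposition) matrix."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T  # shape: (n_variables, n_components)


def project_in_chunks(X, mean, loadings, chunk_size=1000):
    """Compute PC scores chunk by chunk so only a slice is centered at a time."""
    parts = [(X[i:i + chunk_size] - mean) @ loadings
             for i in range(0, X.shape[0], chunk_size)]
    return np.vstack(parts)


def location_index(scores, n_bins=10):
    """Assumed location index: the grid cell each spectrum occupies in PC space."""
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against a zero-width axis
    cells = np.floor((scores - lo) / span * n_bins).astype(int)
    cells = np.clip(cells, 0, n_bins - 1)  # the maximum falls in the last cell
    # Fold the per-axis bin numbers into one integer per spectrum.
    return np.ravel_multi_index(tuple(cells.T), (n_bins,) * scores.shape[1])


def select_spectra(scores, n_bins=10):
    """Phase 2: keep the first spectrum seen in each occupied cell.

    Returns the row indices of the kept spectra and, for each one, the
    number of spectra sharing its cell (its occurrence frequency).
    """
    idx = location_index(scores, n_bins)
    _, selected, counts = np.unique(idx, return_index=True, return_counts=True)
    return selected, counts


# Synthetic stand-in for spectra: 5000 records, 60 variables (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8)) @ rng.normal(size=(8, 60))

mean, loadings = fit_pca(X, n_components=3)
scores = project_in_chunks(X, mean, loadings)
selected, counts = select_spectra(scores, n_bins=10)

# Compare the frequency-weighted barycenter of the selection with the full one.
full_barycenter = scores.mean(axis=0)
weighted_barycenter = np.average(scores[selected], axis=0, weights=counts)

print(f"selection rate: {100 * len(selected) / X.shape[0]:.2f}%")
print("barycenter gap per PC:", np.abs(weighted_barycenter - full_barycenter))
```

Under these assumptions, weighting each kept spectrum by its occurrence frequency keeps the barycenter of the selection within one grid cell of the barycenter of the full data set on every PC axis, which is the kind of check the abstract describes. For data sets too large to decompose in memory, an incremental PCA would be a natural substitute for the single SVD used here.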
Identifiers
pubmed: 39033920
pii: S0022-0302(24)01020-8
doi: 10.3168/jds.2024-24911
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
The Authors. Published by Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).