High-Resolution Soybean Yield Mapping Across the US Midwest Using Sub-field Harvester Data
Abstract/Contents
- Abstract
- Cloud computing and freely available, high-resolution satellite data have enabled recent progress in crop yield mapping at fine scales. However, extensive validation data at a matching resolution remain uncommon or infeasible to collect in much of the world. Here, we use a large-scale ground-truth dataset across the United States Midwest to assess machine learning models’ capacity for soybean yield prediction. First, we compare random forest (RF) implementations across 400,000 fields, testing a range of feature engineering approaches using Sentinel-2 and Landsat spectral data for 20- and 30-meter scale yield prediction. We find that Sentinel-2-based models can explain up to 45% of out-of-sample yield variability across 2017-18, while Landsat models explain up to 43% across the longer 2008-18 period. Using discrete Fourier transforms, or harmonic regressions, proved helpful for capturing soybean phenology, improving a Landsat-based model considerably. Second, we compare RF models trained using our fine-scale harvester data to models trained on freely available county-level data. We find that county-level models rely more heavily on just a few predictors, namely August weather covariates (vapor pressure deficit (VPD), rainfall, temperature) and July and August near-infrared (NIR) observations. As a result, county-scale models perform relatively poorly on field-scale validation, especially for high-yielding fields, but perform similarly to field-scale models when evaluated at the county scale. Finally, we test whether our findings on variable importance can inform improvements to a Scalable Crop Yield Mapper (SCYM) approach that uses crop simulations to train models for yield estimation. Based on findings from our RF models, we employ harmonic regressions to estimate the peak vegetation index (VI) and a VI observation 30 days later, with August rainfall as the sole weather covariate in our new SCYM model. These changes proved effective for improving SCYM’s explained variance and creating a simple, generalizable framework for regions or time periods beyond those for which ground data are available.
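As a minimal sketch of the harmonic-regression step mentioned in the abstract (not the thesis's implementation), the example below fits a truncated Fourier series to a vegetation-index time series by least squares, then reads off the fitted peak and the fitted value 30 days later. The function names, the choice of two harmonics, the 365-day period, and the day-of-year search window are all illustrative assumptions.

```python
import numpy as np


def fit_harmonic(doy, vi, n_harmonics=2, period=365.0):
    """Least-squares fit of VI(t) = c0 + sum_k [a_k cos(w_k t) + b_k sin(w_k t)].

    n_harmonics and period are assumed values, not taken from the thesis.
    """
    t = np.asarray(doy, dtype=float)
    y = np.asarray(vi, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols += [np.cos(w * t), np.sin(w * t)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_harmonic(coef, doy, period=365.0):
    """Evaluate the fitted harmonic curve at the given days of year."""
    t = np.asarray(doy, dtype=float)
    n_harmonics = (len(coef) - 1) // 2
    y = np.full_like(t, coef[0])
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        y += coef[2 * k - 1] * np.cos(w * t) + coef[2 * k] * np.sin(w * t)
    return y


def peak_and_lag(coef, lag_days=30, season=(120, 300)):
    """Return (peak day of year, peak VI, VI lag_days after the peak).

    The search window `season` is a hypothetical growing-season bound.
    """
    days = np.arange(season[0], season[1] + 1)
    fitted = predict_harmonic(coef, days)
    i = int(np.argmax(fitted))
    peak_doy = int(days[i])
    peak_vi = float(fitted[i])
    lag_vi = float(predict_harmonic(coef, [peak_doy + lag_days])[0])
    return peak_doy, peak_vi, lag_vi


# Synthetic, noise-free example series (not real satellite data).
obs_doy = np.arange(100, 320, 10)
obs_vi = 0.5 + 0.3 * np.cos(2.0 * np.pi * (obs_doy - 210) / 365.0)
coef = fit_harmonic(obs_doy, obs_vi)
peak_doy, peak_vi, lag_vi = peak_and_lag(coef)
```

In a workflow like the one the abstract outlines, `peak_vi` and `lag_vi` would then serve as predictors alongside a weather covariate. Fitting a smooth curve first means both features can be computed even when no cloud-free image falls exactly on the dates of interest.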
Description
Type of resource | text |
---|---|
Date created | June 1, 2020 |
Creators/Contributors
Author | Dado, Walter T |
---|---|
Contributing author | Deines, Jillian M |
Contributing author | Patel, Rinkal |
Contributing author | Liang, Sang-Zi |
Primary advisor | Lobell, David B |
Subjects
Subject | School of Earth, Energy & Environmental Sciences |
---|---|
Subject | Crop Yield Mapping |
Subject | Remote Sensing |
Subject | Machine Learning |
Genre | Thesis |
Bibliographic information
Access conditions
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Preferred citation
- Preferred Citation
- Dado, Walter T and Deines, Jillian M and Patel, Rinkal and Liang, Sang-Zi and Lobell, David B. (2020). High-Resolution Soybean Yield Mapping Across the US Midwest Using Sub-field Harvester Data. Stanford Digital Repository. Available at: https://purl.stanford.edu/gj825fq6518
Collection
Master's Theses, Doerr School of Sustainability
Contact information
- Contact
- tekedado@icloud.com