Data source(s): from the corn yield prediction model and USDA data.
Corn and soybean yield prediction provides valuable information about production and prices prior to harvest. Publicly available high-quality yield prediction can help address emergent information asymmetry problems and, in doing so, improve price efficiency in futures markets.
The corn yield prediction is based on Jiang et al. (2018), in which the authors developed a Long Short-Term Memory (LSTM) model to predict corn yields in ten Corn Belt states (Illinois, Indiana, Iowa, Kansas, Michigan, Missouri, Minnesota, Nebraska, Ohio, and South Dakota) and achieved promising results with the sample data. Overall, the model prediction is only 0.83 bushels per acre (bpa) lower than actual corn yields, a smaller difference than that of the corresponding USDA prediction. About 80% of the LSTM county-level corn yield predictions fall within +/-20 bpa of actual yields. The model uses ten variables that affect corn yields, constructed from data sets provided by the USDA, USDA-NRCS, USDA-NASS, NOAA, and IBM Weather Underground.
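The two accuracy summaries quoted above (the average gap between predicted and actual yields, and the share of county predictions within a tolerance band) can be computed as in the following minimal sketch. The function names and the five county values are hypothetical and chosen only for illustration; they are not data from Jiang et al. (2018).

```python
def mean_difference(predicted, actual):
    """Average of (predicted - actual); negative means under-prediction."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(predicted)


def share_within(predicted, actual, tolerance):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Hypothetical county-level yields in bushels per acre (bpa).
predicted_bpa = [170.0, 185.0, 150.0, 200.0, 160.0]
actual_bpa = [175.0, 180.0, 175.0, 195.0, 162.0]

bias = mean_difference(predicted_bpa, actual_bpa)
coverage = share_within(predicted_bpa, actual_bpa, tolerance=20.0)
print(f"mean difference: {bias:.2f} bpa, share within +/-20 bpa: {coverage:.0%}")
```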
The soybean yield prediction is based on Xiong and Ji (2020), who extended the deep learning framework to soybean yield prediction in 11 Corn Belt states (the ten states above plus Wisconsin). The root mean square error (RMSE) of county soybean yields from the best specification is around 5.3 bpa in the study area. This performance is comparable to that of several recent soybean yield prediction models.
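For readers unfamiliar with the root mean square error, the sketch below shows how it is computed from paired predicted and actual county yields. The numbers are made up for illustration and do not come from Xiong and Ji (2020).

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical county soybean yields in bushels per acre (bpa).
predicted_bpa = [52.0, 47.5, 60.0, 41.0]
actual_bpa = [50.0, 49.5, 57.0, 45.0]

error = rmse(predicted_bpa, actual_bpa)
print(f"RMSE: {error:.2f} bpa")
```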
The CARD graphic team prepares the daily yield predictions and maps based on the November yield prediction models from Jiang et al. (2018) and Xiong and Ji (2020). Both studies rely on realized weather variables from the corn growing season (April 1 to October 31). Pre-November in-season corn and soybean yield predictions must rely on weather variables that are not yet realized, so historical weather data from 1981 to 2019 are used to complete the input data set. The median prediction is reported.
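One way to read the in-season procedure is as a scenario ensemble: the weather observed so far is combined with the remainder of the season from each historical year, the model is run once per completed season, and the median of those runs is reported. The sketch below illustrates that reading only. The function names are hypothetical, `toy_model` is a placeholder standing in for the trained network, and the weather values are invented.

```python
from statistics import median


def complete_season(observed, historical_season):
    """Observed days so far, followed by the rest of a historical season."""
    return observed + historical_season[len(observed):]


def toy_model(season_weather):
    """Placeholder for the trained yield model (NOT the LSTM itself)."""
    return 100.0 + 2.0 * sum(season_weather) / len(season_weather)


def in_season_prediction(observed, history_by_year, model):
    """Median prediction over seasons completed with each historical year."""
    predictions = [
        model(complete_season(observed, season))
        for season in history_by_year.values()
    ]
    return median(predictions)


# Hypothetical 4-day "season": 2 days observed, 3 historical years available.
observed = [20.0, 22.0]
history_by_year = {
    1981: [18.0, 19.0, 21.0, 23.0],
    1982: [20.0, 21.0, 25.0, 27.0],
    1983: [19.0, 20.0, 22.0, 20.0],
}

reported = in_season_prediction(observed, history_by_year, toy_model)
print(f"reported (median) prediction: {reported:.1f} bpa")
```

Reporting the median rather than the mean keeps a single extreme historical weather year from dominating the published number.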
Maps are prepared with the daily yield prediction data and historical data queried from USDA Quick Stats.
The yield difference in this map is defined as the predicted yield minus the trend yield in 2020. The trend yield is calculated from the historical county yields queried from USDA Quick Stats. First, we regress yields (the logarithm of yields for corn, yields in levels for soybean) on the year variable along with a set of county dummies. Second, we predict the 2020 trend yields with the estimates from the first-step regression.
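The two-step trend calculation can be sketched as follows. Because the regression has one common year slope plus a dummy for every county, the least-squares slope equals the "within" estimator computed from county-demeaned data, which avoids building the dummy matrix. All names and yield values are hypothetical, and the back-transformation from logs is a plain exponential, which is an assumption: the source does not say whether any retransformation correction is applied.

```python
import math
from collections import defaultdict


def fit_trend(records, use_log):
    """Fit yield (or log yield) on year with county fixed effects.

    records: iterable of (county, year, yield_bpa).
    Returns (common year slope, {county: intercept}).
    """
    by_county = defaultdict(list)
    for county, year, yld in records:
        value = math.log(yld) if use_log else yld
        by_county[county].append((year, value))

    means, num, den = {}, 0.0, 0.0
    for county, obs in by_county.items():
        mean_year = sum(t for t, _ in obs) / len(obs)
        mean_value = sum(v for _, v in obs) / len(obs)
        means[county] = (mean_year, mean_value)
        for t, v in obs:
            # Demeaning within county is equivalent to including county dummies.
            num += (t - mean_year) * (v - mean_value)
            den += (t - mean_year) ** 2
    slope = num / den
    intercepts = {c: mv - slope * my for c, (my, mv) in means.items()}
    return slope, intercepts


def trend_yield(county, year, slope, intercepts, use_log):
    """Predicted trend yield; exp() undoes the log (assumed, no correction)."""
    fitted = intercepts[county] + slope * year
    return math.exp(fitted) if use_log else fitted


# Hypothetical soybean history for two counties (levels, so use_log=False).
soy_records = [
    ("A", 2017, 50.0), ("A", 2018, 51.0), ("A", 2019, 52.0),
    ("B", 2017, 40.0), ("B", 2018, 41.0), ("B", 2019, 42.0),
]
slope, intercepts = fit_trend(soy_records, use_log=False)
trend_2020 = trend_yield("A", 2020, slope, intercepts, use_log=False)

predicted_2020 = 55.0  # hypothetical model prediction for county A
yield_difference = predicted_2020 - trend_2020
print(f"trend: {trend_2020:.1f} bpa, difference: {yield_difference:+.1f} bpa")
```

For corn, the same functions would be called with `use_log=True`, so the fitted slope is an approximate annual growth rate rather than a bpa-per-year increment.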
The harvested acreage difference is defined as the imputed county harvested acreage minus the 2019 harvested acreage queried from USDA Quick Stats. The acreage information comes from the annual Acreage report. For 2020, the report release date is Jun 28, 2020. [the link] The county harvested acreage in 2020 is imputed from the relationship between county and state planted/harvested acreage statistics in 2019.
Note: With the county information in Links 1 and 2, we can calculate each county's share of planted acreage in its state and the county's harvested acreage as a share of total planted acreage. Together with the reported state planted acreage in Link 3, we can compute the county harvested acreage of corn/soybean in 2019.
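The acreage imputation can be sketched as a product of shares. The version below is one reading of the note, under the assumption that the state planted acreage from the Acreage report is scaled by each county's 2019 share of state planted acreage and by the county's 2019 harvested-to-planted ratio. The function name and all acreage figures are hypothetical.

```python
def impute_harvested_acres(county_planted_2019, county_harvested_2019,
                           state_planted_2019, state_planted_report):
    """Impute county harvested acres from state planted acres.

    Assumes the county keeps its 2019 share of state planted acreage and
    its 2019 harvested-to-planted ratio.
    """
    planted_share = county_planted_2019 / state_planted_2019
    harvest_ratio = county_harvested_2019 / county_planted_2019
    return state_planted_report * planted_share * harvest_ratio


# Hypothetical figures, in acres.
imputed_2020 = impute_harvested_acres(
    county_planted_2019=200_000,
    county_harvested_2019=190_000,
    state_planted_2019=10_000_000,
    state_planted_report=11_000_000,
)
acreage_difference = imputed_2020 - 190_000  # imputed minus 2019 harvested
print(f"imputed: {imputed_2020:,.0f} acres, difference: {acreage_difference:+,.0f}")
```

Under this reading the county planted acreage cancels out, so the result is simply the reported state planted acreage times the county's 2019 harvested acreage divided by the state's 2019 planted acreage.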
The production in 2020 is computed as the predicted corn yield times the imputed harvested acreage explained above.
The majority of the roughly 1,000 counties in the 11 (soybean) or 10 (corn) Corn Belt states are explicitly modeled in either Jiang et al. (2018) or Xiong and Ji (2020). Our daily prediction model produces yield predictions for the modeled counties. These predictions are weighted by imputed harvested acreage to obtain the yield for the Corn Belt counties with individually reported acreage numbers in 2019 (USDA Quick Stats). In 2019, more than 600 counties had individually reported acreage numbers. In total, these counties represent 78.9% (corn) and 77.1% (soybean) of the total harvested acreage in the Corn Belt states, and 60.7% (corn) and 58.2% (soybean) of the national total. Their production represents 81.2% (corn) and 77.7% (soybean) of the total production in the Corn Belt states, and 64.4% (corn) and 61.4% (soybean) of the national total production. In 2019, the average yield for these 600-plus counties was 178.49 bpa for corn and 50.12 bpa for soybean; the corresponding national yields were 168.23 bpa for corn and 47.46 bpa for soybean.
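The acreage weighting amounts to dividing total predicted production by total harvested acreage. A minimal sketch with three hypothetical counties (the function name and all values are illustrative, not actual predictions):

```python
def acreage_weighted_yield(counties):
    """Aggregate yield from (predicted_yield_bpa, harvested_acres) pairs."""
    total_acres = sum(acres for _, acres in counties)
    # County production is predicted yield times imputed harvested acreage.
    total_production = sum(yld * acres for yld, acres in counties)
    return total_production / total_acres


# Hypothetical counties: (predicted yield in bpa, imputed harvested acres).
counties = [
    (180.0, 100_000),
    (160.0, 50_000),
    (200.0, 50_000),
]

belt_yield = acreage_weighted_yield(counties)
print(f"acreage-weighted yield: {belt_yield:.1f} bpa")
```

Weighting by acreage means a large county moves the aggregate more than a small one, which a simple average of county yields would not capture.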
Once the Corn Belt yield is obtained, the daily national yield is imputed by the formula: national yield in 2019 * Corn Belt prediction / Corn Belt yield in 2019. The implicit assumption is that yields everywhere change by the same percentage as in the Corn Belt. For the national production, we simply multiply the imputed national yield by an estimate of harvested acreage imputed from the 2019 annual Acreage report.
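The scaling formula can be written out as below. The 2019 corn yields are the figures quoted in the text; the Corn Belt prediction and the national harvested acreage are hypothetical placeholders, as is the function name.

```python
def impute_national_yield(national_yield_2019, belt_prediction, belt_yield_2019):
    """Scale the 2019 national yield by the Corn Belt's proportional change."""
    return national_yield_2019 * belt_prediction / belt_yield_2019


NATIONAL_CORN_2019 = 168.23  # bpa, from the text
BELT_CORN_2019 = 178.49      # bpa, from the text

belt_prediction = 185.0      # hypothetical Corn Belt prediction, bpa
national_yield = impute_national_yield(
    NATIONAL_CORN_2019, belt_prediction, BELT_CORN_2019
)

national_acres = 80_000_000  # hypothetical harvested acreage estimate
national_production = national_yield * national_acres  # bushels
print(f"national yield: {national_yield:.2f} bpa")
print(f"national production: {national_production:,.0f} bushels")
```

If the Corn Belt prediction equals the 2019 Corn Belt yield, the function returns the 2019 national yield unchanged, which is a quick sanity check on the formula.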