BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CERN//INDICO//EN
BEGIN:VEVENT
SUMMARY:Hybrid Stacking and Embedded Regression with Multi-Phase Feature S
 election for Explainable Crop Yield Prediction in Botswana
DTSTART;VALUE=DATE-TIME:20251203T133500Z
DTEND;VALUE=DATE-TIME:20251203T135000Z
DTSTAMP;VALUE=DATE-TIME:20260609T202037Z
UID:indico-contribution-2959@events.chpc.ac.za
DESCRIPTION:Speakers: Kalu Ubi Kalu (University of Botswana)\, George And
 erson (Co-Author)\, Audrey Masizana (University of Botswana)\nHybrid Stac
 king and Embedded Regression with Multi-Phase Feature Selection for Explai
 nable Crop Yield Prediction in Botswana\nAbstract\nIn Sub-Saharan Afric
 a\, climate instability\, inaccurate data\, and a lack of precision agr
 iculture tools make it extremely difficult to predict crop yields accu
 rately. These constraints are especially critical in Botswana\, w
 here most agricultural activities are rain-fed and highly vulnerable to en
 vironmental changes. To provide accurate\, comprehensible\, and context-sp
 ecific yield predictions for four staple crops (maize\, millet\, pul
 ses\, and sorghum)\, this study uses a hybrid machine learning appro
 ach. The appro
 ach integrates multiple regression algorithms: Random Forest\, XGBoost\, S
 upport Vector Regression\, and Multi-Layer Perceptron within a stacked ens
 emble architecture tailored to Botswana's agricultural data context. To o
 ptimize predictive power and interpretability\, a multi-phase feature sel
 ection strategy was applied\, combining entropy filtering\, mutual informa
 tion\, recursive feature elimination (RFE)\, and engineered temporal featu
 res through lag variables. \nThis process refined input variables for both
  the staging models and region-specific selection\, ensuring robust model 
 generalization. Model performance was evaluated using historical yield\, m
 eteorological\, and soil datasets\, with R²\, RMSE\, and MAE employed a
 s metrics. The Stacking Hybrid Regression Model performed exceptionall
 y well in yield prediction for pulses and sorghum\, achieving the bes
 t performance with R² = 0.94\, RMSE = 0.60 t/ha\, and MAE = 0.32 t/h
 a. The most signi
 ficant predictors were rainfall\, temperature fluctuation\, and lagged yie
 ld values\, according to a unified interpretability framework that was pro
 duced by combining SHapley Additive exPlanations (SHAP) with entropy analy
 sis. Surprisingly\, entropy analysis showed that sorghum had greater pred
 ictor complexity and showed the ability to adjust to unpredictable weath
 er
 . Time-horizon stability of the model was confirmed by forward simulations
  for 2025–2028. \nThese results confirm that interpretable hybrid ensemb
 les can satisfy precision agriculture's accuracy and transparency requirem
 ents when reinforced by multi-phase feature selection. The suggested appro
 ach supports climate risk management tactics for Botswana's farmers by pro
 viding useful information for early-season production projection and input
  distribution. Additionally\, other sub-Saharan regions with comparable en
 vironmental and data-related constraints may find the methodology applicab
 le. \nKeywords: predictive crop yield\, precision agriculture\, Botswana\,
  XAI\, multi-phase feature selection\, hybrid ensemble models\, and SHAP.\
 n\nhttps://events.chpc.ac.za/event/155/contributions/2959/
LOCATION:Century City Conference Centre 1/1-7 - Room 7
URL:https://events.chpc.ac.za/event/155/contributions/2959/
END:VEVENT
END:VCALENDAR
