Forecasting Global Horizontal Irradiance Using Deep Learning

Deep learning framework for forecasting Global Horizontal Irradiance in Ho Chi Minh City using satellite-derived data and state-of-the-art time series models.

Overview

Developed a comprehensive deep learning framework for forecasting Global Horizontal Irradiance (GHI) in Ho Chi Minh City, Vietnam. This research systematically compared 10 neural network architectures—from traditional models (LSTM, 1D-CNN, CNN-LSTM, MLP, TCN) to advanced architectures (Transformer, Informer, TSMixer, iTransformer, Mamba)—using a decade of satellite-derived data from the National Solar Radiation Database (NSRDB).

Problem Context

Accurate solar irradiance forecasting is critical for:

  • Grid integration of solar power and energy trading
  • Optimizing solar panel operations and scheduling
  • Supporting Vietnam's renewable energy transition

Vietnam's tropical climate presents unique forecasting challenges due to rapid weather changes and monsoon seasons, making robust prediction models essential.

Data Pipeline

Data Source & Processing

  • Source: National Solar Radiation Database (NSRDB) Himawari-7 satellite data
  • Period: 10 years of hourly measurements (2011–2020)
  • Scale: 87,600 timesteps across 105 grid cells covering Ho Chi Minh City
  • Split: Training (2011–2018), Validation (2019), Test (2020)
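
A minimal sketch of this chronological split, assuming the data lives in a pandas DataFrame with a `timestamp` column (the file and column names are hypothetical, not the project's actual pipeline):

```python
import pandas as pd

# Hypothetical file and column names; the real NSRDB export schema may differ.
df = pd.read_csv("nsrdb_hcmc_hourly.csv", parse_dates=["timestamp"])

# Chronological split: train on 2011-2018, validate on 2019, test on 2020,
# so the model is always evaluated on data from after its training period.
year = df["timestamp"].dt.year
train, val, test = df[year <= 2018], df[year == 2019], df[year == 2020]
```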

Feature Engineering

  • Target: Global Horizontal Irradiance (GHI; observed range 9–829 W/m²)
  • Atmospheric: DNI, DHI, AOD, cloud type, cloud effective radius, surface albedo
  • Temporal: Cyclical encodings (hour, day, month, day-of-week using sin/cos)
  • Computed: Clearsky GHI reference, nighttime mask based on solar zenith angle
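
A sketch of the cyclical encoding and nighttime mask, again assuming a hypothetical DataFrame with `timestamp` and `solar_zenith_angle` columns:

```python
import numpy as np
import pandas as pd

def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Encode periodic time fields as sin/cos pairs so that, e.g., hour 23
    and hour 0 end up close together in feature space."""
    ts = df["timestamp"].dt
    for name, values, period in [
        ("hour", ts.hour, 24),
        ("day", ts.dayofyear, 365),
        ("month", ts.month, 12),
        ("dow", ts.dayofweek, 7),
    ]:
        df[f"{name}_sin"] = np.sin(2 * np.pi * values / period)
        df[f"{name}_cos"] = np.cos(2 * np.pi * values / period)
    # Nighttime mask: the sun is at or below the horizon once the
    # solar zenith angle reaches 90 degrees.
    df["is_night"] = (df["solar_zenith_angle"] >= 90).astype(int)
    return df
```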

Key Results

Comprehensive Model Comparison

| Model | MSE ↓ | RMSE (W/m²) ↓ | MAE (W/m²) ↓ | R² ↑ | Speed (samples/sec) ↑ |
|---|---|---|---|---|---|
| Transformer | 2816.77 | 53.07 | 24.26 | 0.9696 | 239,871 |
| Informer | 2846.86 | 53.36 | 24.90 | 0.9692 | 117,882 |
| TSMixer | 2848.61 | 53.37 | 25.66 | 0.9692 | 88,357 |
| TCN | 2856.48 | 53.45 | 25.32 | 0.9691 | 644,131 |
| LSTM | 2859.22 | 53.47 | 26.87 | 0.9691 | 215,547 |
| iTransformer | 2869.81 | 53.57 | 25.62 | 0.9690 | 272,867 |
| Mamba | 3006.05 | 54.83 | 25.84 | 0.9675 | 193,084 |
| MLP | 3165.89 | 56.27 | 27.84 | 0.9658 | 5,642,588 |
| CNN-LSTM | 3274.12 | 57.22 | 29.81 | 0.9646 | 310,191 |
| 1D-CNN | 3549.03 | 59.57 | 32.44 | 0.9617 | 996,542 |

Key Findings:

  • The Transformer achieves the lowest error of all models, reducing MSE by ~1.4% relative to the best traditional model (TCN)
  • TCN offers a highly competitive alternative with 2.7× higher throughput than Transformer
  • MLP achieves exceptional speed (5.6M samples/sec) for resource-constrained deployments
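
For reference, a small sketch of how the four accuracy metrics in the table above can be computed from observed and predicted GHI arrays:

```python
import numpy as np

def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MSE, RMSE, MAE (W/m^2) and R^2, matching the comparison table."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": float(np.mean(np.abs(err))),
        "R2": 1.0 - ss_res / ss_tot,
    }
```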

Model Compression

Applied three compression techniques to the best-performing Transformer model:

| Method | Size (MB) | Size Reduction | Latency (ms) | MAE (W/m²) |
|---|---|---|---|---|
| *Baseline* | 1.07 | – | 3792.13 | 24.26 |
| Int8 (CPU) | 0.44 | 64.0% | 519.88 | 25.24 |
| FP16 (GPU) | 0.65 | 46.0% | 22.02 | 24.25 |
| Pruning (50%) | 1.07 | 0.0% | 3857.31 | 176.21 |
| Distilled Student | 0.82 | 23.5% | 3081.46 | 23.78 |

Knowledge distillation was the most effective technique: the distilled student outperformed its teacher (MAE 23.78 vs 24.26 W/m²) while reducing model size by 23.5% and latency by ~19%. FP16 quantization achieved a 47% latency reduction with near-identical accuracy. Structured pruning (50%) was ineffective for this architecture, leaving model size unchanged while MAE degraded to 176.21 W/m².
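
A minimal sketch of a distillation step for this regression setting; the `alpha` weighting and the blended MSE objective are illustrative assumptions, not the exact training recipe used in the project:

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, x, y, optimizer, alpha=0.5):
    """One optimisation step blending the ground-truth loss with a
    teacher-matching loss (alpha=0.5 is an assumed weighting)."""
    teacher.eval()
    with torch.no_grad():
        y_teacher = teacher(x)           # soft targets from the frozen teacher
    y_student = student(x)
    loss = alpha * F.mse_loss(y_student, y) \
        + (1 - alpha) * F.mse_loss(y_student, y_teacher)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```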

Explainability with SHAP

SHAP analysis revealed distinct temporal attention patterns:

  • Mamba: U-shaped importance profile, weighting both the distant past (t-23) and the most recent timestep (t-0)
  • Transformer: extreme recency bias, with nearly all importance concentrated at t-0
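
A sketch of how such per-lag attributions can be obtained with the `shap` library; `GradientExplainer` is one explainer that handles PyTorch models, though the study's exact SHAP configuration is not specified here:

```python
import numpy as np
import shap

def explain_lags(model, X_bg, X_test):
    """Mean |SHAP| per input lag for a single-output PyTorch forecaster
    whose inputs are windows of shape (batch, 24, n_features)."""
    explainer = shap.GradientExplainer(model, X_bg)
    shap_values = explainer.shap_values(X_test)  # (batch, 24, n_features)
    # Average attribution magnitude over samples and features to get one
    # importance score per lag, ordered t-23 ... t-0.
    return np.abs(np.asarray(shap_values)).mean(axis=(0, 2))
```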

Key Predictors

  1. clearsky_ghi: Strongest predictor for GHI estimation
  2. Temporal features: hour_cos, hour_sin dominate at recent timesteps
  3. Cloud conditions: cloud_type, surface_albedo significantly influence predictions
  4. Minimal impact: wind_speed, day-of-week features (could be omitted to streamline models)

Conclusion

This research demonstrates that Transformer and TCN architectures excel at short-term GHI forecasting. Knowledge distillation enables efficient deployment without sacrificing accuracy, supporting sustainable AI practices. The explainability analysis enhances model interpretability, enabling stakeholders to understand key drivers of solar irradiance variability for informed energy planning.

Key Achievements

  • Achieved state-of-the-art accuracy with Transformer model (MSE: 2816.77, MAE: 24.26 W/m², R²: 0.9696)
  • Reduced model size by 23.46% and latency by 18.74% via knowledge distillation while improving accuracy
  • Processed a decade of hourly data (87,600 timesteps × 105 grid cells) from NSRDB satellite data
  • Used SHAP analysis to identify clearsky_ghi and temporal features as key predictors

Technologies Used

PyTorch · Transformer · TCN · Mamba · Informer · iTransformer · TSMixer · SHAP · ONNX

Skills Applied

Time Series Forecasting · Explainable AI · Model Compression · Knowledge Distillation

Project Details

  • Role: Researcher
  • Team Size: Individual
  • Duration: Feb 2025 – May 2025
