Research Article Open Access

Partial Least Squares Regression Based Variables Selection for Water Level Predictions

Noraini Ibrahim¹ and Antoni Wibowo¹
  • ¹ Department of Industrial Computing and Modelling Mathematics, Faculty of Computer Science and Information Systems, 81310, UTM Johor Bahru, Johor, Malaysia


Floods are a common phenomenon in Kuala Krai, in the state of Kelantan, Malaysia. Every year, floods affect biodiversity in this region and cause property losses in its residential areas. Residents of Kelantan suffer from floods whenever water overflows into the areas adjoining rivers, lakes or dams. In this study, month, average monthly rainfall, temperature, relative humidity and surface wind were used as predictors, while the water level of the Galas River was used as the response. Selecting suitable predictor variables is an important issue in developing a prediction model because the data comprise many variables from the meteorological and hydrological departments. We conduct K-fold Cross-Validation (CV) to select the important variables for water level prediction. To forecast the water level in the Galas River, we adopt Ordinary Linear Regression (OLR) and Partial Least Squares Regression (PLSR). Because the original data contain missing values, the datasets must first be pre-processed. We apply two types of pre-processing: replacing missing values with the mean of the corresponding month (type I) and estimating them with OLR (type II). Based on the experiments, PLSR is a more suitable model than OLR for predicting the water level in the Galas River, and type I pre-processing gives higher accuracy than type II.
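
The type I pre-processing and the K-fold CV procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the contiguous fold layout and the toy numbers are assumptions made for illustration, and the regression step is a one-predictor least-squares line standing in for the multi-predictor OLR and PLSR models studied in the article.

```python
from statistics import mean


def impute_month_mean(months, values):
    """Type I style imputation: replace None with the mean of the
    observed values recorded for the same calendar month."""
    by_month = {}
    for m, v in zip(months, values):
        if v is not None:
            by_month.setdefault(m, []).append(v)
    month_mean = {m: mean(vs) for m, vs in by_month.items()}
    # A month with no observed value at all stays None.
    return [v if v is not None else month_mean.get(m)
            for m, v in zip(months, values)]


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_rmse_simple_ols(x, y, k):
    """K-fold CV root-mean-square error of a one-predictor least-squares line.
    Assumes the training x values in every fold are not all identical."""
    sq_errors = []
    for test in k_fold_indices(len(x), k):
        held_out = set(test)
        train = [i for i in range(len(x)) if i not in held_out]
        mx = mean(x[i] for i in train)
        my = mean(y[i] for i in train)
        sxx = sum((x[i] - mx) ** 2 for i in train)
        sxy = sum((x[i] - mx) * (y[i] - my) for i in train)
        slope = sxy / sxx
        intercept = my - slope * mx
        sq_errors.extend((y[i] - (intercept + slope * x[i])) ** 2 for i in test)
    return mean(sq_errors) ** 0.5


# Toy data, invented for illustration only (not taken from the study).
months = [1, 2, 1, 2, 1]
rainfall = [10.0, 4.0, None, 6.0, 20.0]
filled = impute_month_mean(months, rainfall)  # the None becomes 15.0
```

Comparing candidate predictor sets or models then amounts to computing the CV error for each candidate on the imputed data and preferring the one with the lowest error.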

American Journal of Applied Sciences
Volume 10 No. 4, 2013, 322-330


Submitted On: 25 May 2012 Published On: 7 April 2013

How to Cite: Ibrahim, N. & Wibowo, A. (2013). Partial Least Squares Regression Based Variables Selection for Water Level Predictions. American Journal of Applied Sciences, 10(4), 322-330.

  • Cross-Validation (CV)
  • Ordinary Linear Regression (OLR)
  • Partial Least Squares Regression (PLSR)
  • Galas River
  • Water Level