Deep learning is widely used in industrial process monitoring, control, and optimization. However, its applications in the wastewater industry remain under-explored, because deep learning requires a large amount of labeled training data to induce effective predictive models. Sensors are costly, and sampling and laboratory analysis are infrequent and delayed, so wastewater treatment process data can be sparse and recorded at varying frequencies. One option for addressing limited training data is transfer learning. However, the large covariate shift between the commonly adopted source domains for transfer learning and the target domain of wastewater processes leads to unacceptable performance with this approach. We address this issue by proposing a novel synthetic data generation method for deep predictive modeling of wastewater plants. Our technique employs a Markov process based on a random walk to generate abundant annotated data for the target domain. The method preserves the temporal dynamics and distribution of the original data, so the synthetic samples closely mimic plausible real samples from the domain. We extensively evaluate our method on two different high-rate algae-based treatment data sets, demonstrating considerable performance gains over existing transfer learning approaches. Our proposed algorithm can help plant operators deploy responsive supportive models with limited data.
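
To make the idea concrete, the following is a minimal sketch of one way a random-walk Markov process could generate synthetic samples from a short measured series. It is an illustration under stated assumptions, not the algorithm evaluated in this work: the quantile binning into states, the first-order transition matrix, the add-one smoothing, and the names `fit_transition_matrix` and `random_walk_synthetic` are all hypothetical choices made for this example.

```python
import numpy as np


def fit_transition_matrix(series, n_states=5):
    """Bin a 1-D series into quantile states and estimate first-order
    transition probabilities between consecutive states (assumed design)."""
    series = np.asarray(series, dtype=float)
    # Interior quantile edges; np.digitize then maps values to 0..n_states-1.
    edges = np.quantile(series, np.linspace(0.0, 1.0, n_states + 1)[1:-1])
    states = np.digitize(series, edges)
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1.0
    # Add-one smoothing so every row is a valid probability distribution.
    counts += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)
    return probs, states


def random_walk_synthetic(series, length, n_states=5, seed=0):
    """Generate a synthetic series by walking the fitted Markov chain and
    drawing each value from the observed values of the current state."""
    series = np.asarray(series, dtype=float)
    rng = np.random.default_rng(seed)
    probs, states = fit_transition_matrix(series, n_states)
    # Pool of observed values belonging to each state.
    pools = [series[states == s] for s in range(n_states)]
    state = int(states[0])
    out = np.empty(length)
    for t in range(length):
        # Fall back to the full series if a state has no observations.
        pool = pools[state] if pools[state].size else series
        out[t] = rng.choice(pool)
        # Random-walk step: move to the next state using the fitted row.
        state = int(rng.choice(n_states, p=probs[state]))
    return out
```

In this sketch, drawing values from the observed pools keeps the synthetic samples within the empirical distribution of the measurements, while the transition matrix carries the first-order temporal structure. Labels for supervised training could then be formed from the generated series, for example by pairing a window of values with the value that follows it.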