Data gap filling using cloud-based distributed Markov chain cellular automata framework for land use and land cover change analysis: Inner Mongolia as a case study
Geography and Geology
Advances in remote sensing have made massive amounts of remotely sensed data available to support land use/land cover (LULC) change studies over larger areas and longer time periods. A major challenge, however, is missing data caused by poor weather conditions and possible sensor malfunctions during image collection. In this study, cloud-based, open-source distributed frameworks based on Apache Spark and Apache Giraph were combined into an integrated infrastructure for filling data gaps in a large-area LULC dataset. Data mining techniques (k-medoids clustering and quadratic discriminant analysis) were applied to facilitate sub-space analyses. Ancillary environmental and socioeconomic conditions were integrated to support localized model training. Multi-temporal transition probability matrices were deployed in a graph-based Markov–cellular automata simulator to fill in missing data. A comprehensive dataset for Inner Mongolia, China, covering 2000 to 2016 was used to assess the feasibility, accuracy, and performance of this gap-filling approach. The result is a cloud-based distributed Markov–cellular automata framework that exploits the scalability and high performance of cloud computing while also achieving high accuracy when filling the data gaps common in longer-term LULC studies.
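To make the Markov–cellular automata idea concrete, the following is a minimal single-machine sketch in Python with NumPy. It is not the Spark/Giraph implementation described in the abstract, and it omits the clustering, discriminant analysis, and ancillary variables. The function names (`transition_matrix`, `fill_gaps`), the `nodata` value of -1, the 3x3 neighbourhood, and the scoring rule (transition probability multiplied by one plus the neighbour count of each class) are all assumptions chosen for illustration, as are the toy class maps.

```python
import numpy as np

def transition_matrix(before, after, n_classes, nodata=-1):
    """Estimate row-normalized class transition probabilities from two
    co-registered class maps, ignoring cells missing in either map."""
    valid = (before != nodata) & (after != nodata)
    counts = np.zeros((n_classes, n_classes))
    # Accumulate one count per (class before, class after) pair.
    np.add.at(counts, (before[valid], after[valid]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes never observed in `before` keep an identity row (assumption).
    return np.divide(counts, row_sums, out=np.eye(n_classes), where=row_sums > 0)

def fill_gaps(prev, current, P, nodata=-1):
    """Fill missing cells in `current` using the Markov transition row of
    the cell's previous class, weighted by known 3x3 neighbours (assumed rule)."""
    n_classes = P.shape[0]
    filled = current.copy()
    rows, cols = current.shape
    for r, c in zip(*np.nonzero(current == nodata)):
        if prev[r, c] == nodata:
            continue  # no previous state available, so leave the gap
        window = current[max(r - 1, 0):min(r + 2, rows),
                         max(c - 1, 0):min(c + 2, cols)]
        neighbours = window[window != nodata]
        support = np.bincount(neighbours, minlength=n_classes)
        score = P[prev[r, c]] * (1.0 + support)
        filled[r, c] = int(np.argmax(score))
    return filled

# Toy 3x3 class maps for three consecutive dates (hypothetical data).
map_t0 = np.array([[0, 0, 1], [0, 1, 1], [2, 2, 1]])
map_t1 = np.array([[0, 0, 1], [0, 1, 1], [2, 1, 1]])
map_t2 = np.array([[0, -1, 1], [0, 1, 1], [-1, 1, 1]])  # -1 marks gaps

P = transition_matrix(map_t0, map_t1, n_classes=3)
filled_t2 = fill_gaps(map_t1, map_t2, P)
```

In this sketch a single transition matrix is estimated from one pair of dates. A multi-temporal variant would estimate one matrix per consecutive pair of dates, or per sub-region, before filling.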
Link to Published Version
Lan, H., Stewart, K., Sha, Z., Xie, Y., & Chang, S. (2022). Data gap filling using cloud-based distributed Markov chain cellular automata framework for land use and land cover change analysis: Inner Mongolia as a case study. Remote Sensing, 14(3), 445. https://doi.org/10.3390/rs14030445