Prediction of Cadmium Adsorption Using a Combined Machine Learning Model with a Multi-Hybrid Input Approach: Advancing Artificial Intelligence in Soil Contaminations

Wen-Qiang Wang; Pengjie Wang

COJ Robotics & Artificial Intelligence

Wen-Qiang Wang* and Pengjie Wang

Department of Civil Engineering, Queen’s University, Canada

*Corresponding author: Wen-Qiang Wang, Beaty Water Research Centre, Department of Civil Engineering, 69 Union Street, Queen’s University, Kingston, K7L 3N6, Canada

Submission: December 11, 2024; Published: February 11, 2025

DOI: 10.31031/COJRA.2025.04.000587

ISSN: 2832-4463
Volume 4 Issue 3

Abstract

Predicting cadmium (Cd(II)) adsorption in soils is critical for managing heavy metal contamination and mitigating its environmental risks. This study introduces a hybrid machine learning model that integrates Decision Trees (DT), Multi-Output Nonlinear Regression (MNLR), and Backpropagation Neural Networks (BPNN) to achieve accurate predictions of Cd(II) adsorption capacity. The model incorporates advanced data scaling techniques and feature expansion to effectively handle data heterogeneity and capture complex nonlinear relationships among soil properties, including Cation-Exchange Capacity (CEC), Organic Carbon Content (OC), clay content, pH, and soil-to-solution ratio. Sensitivity analysis identifies clay content as the most influential parameter, revealing its significant role in modulating adsorption behavior. The model demonstrates superior predictive performance, with an R² value of 0.898 and a substantial reduction in training loss, highlighting its potential for advancing environmental risk assessment and remediation strategies for contaminated soils. This work establishes a foundation for applying machine learning to optimize predictions in environmental science, offering insights into heavy metal adsorption and guiding the development of efficient remediation approaches.

Keywords: Machine learning; Heavy metal; Adsorption; Soil remediation; Decision tree; Neural network

Introduction

Heavy metals in soil pose significant risks to human and animal health. While soil adsorption can limit the long-range transport of heavy metals, the retention of metal ions in local soils still substantially threatens nearby ecosystems [1,2]. Therefore, accurately estimating adsorption capacity is critical for assessing environmental risks and developing effective remediation strategies [1,3,4]. The adsorption capacity of heavy metals in soil depends on various factors, including pH, Cation-Exchange Capacity (CEC), clay content, and Organic Carbon Content (OC), all of which vary significantly across soil types [4,5]. As a result, estimating the adsorption capacity of a specific soil is challenging [6,7]. Traditional batch experiments are precise but time-consuming [1-8], while empirical isotherms such as the Freundlich and Langmuir models require experimental data and predefined variables [9-11], limiting their applicability in heterogeneous soils [12]. Advanced models such as Surface Complexation Models (SCMs) rely on physical governing equations but are effective mainly for soils with well-defined surface functional groups [13].

Machine learning provides an innovative and powerful alternative because it excels at uncovering hidden relationships without extensive prerequisites [14]. This case report introduces a novel machine learning model that uses soil properties to predict heavy metal adsorption in soils. Cadmium (Cd) is selected as a case study due to its toxicity, carcinogenicity, and bioaccumulative effects [15]. The case study assesses the effectiveness of machine learning in predicting Cd(II) adsorption and identifies the key soil properties that significantly influence adsorption behavior. The results highlight the model’s potential as a tool for informing the redevelopment of brownfield sites contaminated by heavy metals.

Case Presentation

Data pre-processing

The dataset in this study includes the adsorption capacity of Cd(II) and various soil properties. Key input variables, selected for their strong correlation with heavy metal adsorption, include CEC, OC, clay content, equilibrium heavy metal concentration, soil pH, solution pH, solution temperature, and soil-to-solution ratio. The dataset, compiled from open literature, consists of 1,093 soil samples from diverse Cd(II)-contaminated sites. Because these variables span a wide range of magnitudes and contain extreme values, data scaling was applied prior to model training to mitigate the impact of data heterogeneity. Kernel Density Estimation (KDE) with marginal histograms was used to check that scaling preserved the original data distribution. Additionally, data normalization was applied to smooth the variable distributions, enhancing the model’s ability to interpret the underlying relationships.
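The scaling and normalization steps are not specified in detail here, so the following is only a minimal sketch of one common approach: a log(1 + x) transform to compress wide ranges, followed by min-max normalization to [0, 1]. Both choices, the function names, and the sample CEC values are illustrative assumptions, not the procedure used in this study.

```python
import math


def log_scale(values):
    """Compress a wide, non-negative range with log(1 + x)."""
    return [math.log1p(v) for v in values]


def min_max_normalize(values):
    """Map values linearly onto [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical CEC readings spanning several orders of magnitude
# (made-up numbers, not taken from the dataset).
cec = [0.5, 4.0, 12.0, 35.0, 250.0]
cec_scaled = min_max_normalize(log_scale(cec))
```

Because both transforms are monotonic, the rank order of the samples is preserved while the extreme value no longer dominates the range.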

Framework of machine learning based model

The model applied in this study is a hybrid machine learning model adapted from previous work [16]. It begins with a Decision Tree (DT) that classifies soil types based on their inherent characteristics. The splitting algorithm of the applied DT is given by Equations (1)-(4).

Where Sum_left and Sum_right represent the sums of all data points in each respective leaf, denoted y_left and y_right (Equations (1) and (2)). The weights of each leaf, denoted w_r and w_l, were introduced to adjust the dataset in each leaf (Equation (3)). The decision tree algorithm dynamically adjusts these weights to minimise the impurity until the specified criterion is reached (Equation (4)). Equations (1) to (4) use a modified Friedman Mean Squared Error (MSE) as the splitting criterion [17], with parameters such as a predefined maximum depth and a minimum of 50 samples per leaf to prevent overfitting. Next, a Multi-Output Non-Linear Regression (MNLR) is applied, utilizing 6th-degree polynomial features to capture complex non-linear relationships. These features are expanded into 465 variables, which are combined with the original input values as a reference set and processed using a random forest regressor. Finally, a Backpropagation Neural Network (BPNN) is employed to further refine the predictions. The BPNN in this study consists of an input layer, two hidden layers, and an output layer, with the weights updated by an optimization algorithm. The forward propagation of the BPNN employs the Leaky ReLU activation function, as shown in Equation (5), to introduce non-linearity between the layer connections:

where x_i represents the input value from the previous nodes and w_i is the corresponding weight in the current layer. The Leaky ReLU activation function enables the model to capture complex relationships in the data. The optimized weights are then determined using the Stochastic Gradient Descent (SGD) optimization algorithm to minimize the Mean Absolute Error (MAE) between the model’s predictions and the ground truths over 10,000 training epochs [18]. The overall structure of the model is illustrated in Figure 1.
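To make the forward pass and loss concrete, the sketch below applies Leaky ReLU to the weighted sum of x_i and w_i for a single node and evaluates the MAE. The negative slope of 0.01, the optional bias term, and all function names are assumptions made for illustration; the study's exact settings are not stated here.

```python
def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: pass positive values through, scale negatives by alpha.

    alpha = 0.01 is a common default and an assumption here.
    """
    return z if z > 0 else alpha * z


def neuron_output(inputs, weights, bias=0.0, alpha=0.01):
    """Weighted sum of inputs x_i and weights w_i, followed by Leaky ReLU."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return leaky_relu(z, alpha)


def mean_absolute_error(predictions, targets):
    """MAE: average absolute difference between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Weighted sum is +1.0, so the activation returns it unchanged.
positive = neuron_output([1.0, 2.0], [0.5, 0.25])
# Weighted sum is -1.0, so the activation returns alpha * (-1.0).
negative = neuron_output([1.0, 2.0], [-0.5, -0.25])
loss = mean_absolute_error([positive, negative], [1.0, 0.0])
```

Unlike a plain ReLU, the small negative slope keeps a non-zero gradient for negative weighted sums, so SGD can still update the weights of nodes that would otherwise output zero.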

Figure 1: Schematic representation of the hybrid model. The input of the BPNN includes both the expanded variables from the MNLR and the original input values.
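The combination of expanded variables and original inputs can be sketched as follows. This is an illustrative reading, not the study's implementation: it assumes the expansion consists of all monomials of total degree 2 up to the chosen degree, and that these are simply concatenated with the original values. The function names and the two-variable sample row are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def higher_order_terms(row, degree):
    """All monomials of the inputs with total degree 2..degree."""
    terms = []
    for d in range(2, degree + 1):
        for combo in combinations_with_replacement(row, d):
            terms.append(prod(combo))
    return terms


def n_higher_order_terms(n_inputs, degree):
    """Count monomials of total degree 2..degree.

    comb(n + d, d) counts all monomials of degree 0..d; the constant
    term and the n linear terms are subtracted.
    """
    return comb(n_inputs + degree, degree) - 1 - n_inputs


row = [2.0, 3.0]                       # two hypothetical scaled inputs
expanded = higher_order_terms(row, 2)  # x1^2, x1*x2, x2^2
network_input = expanded + row         # expanded variables plus originals
```

Under these assumptions the number of terms grows quickly with the polynomial degree, which is why the expanded set is far larger than the original input set.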

Model performance

The DT-MNLR machine learning model demonstrated strong predictive performance for Cd(II) adsorption (Table S1, Supporting Information), with an R² value of 0.898, an RMSE of 0.593, and a Nash-Sutcliffe Efficiency (NSE) of 0.896, indicating a high degree of accuracy in the model’s predictions. In addition, the introduction of a highly non-linear relationship and the use of the original reference set significantly reduced the training loss, from 1.25 to 0.25 (an 80% reduction). Figure 2 and the regression statistics reveal a strong linear correlation between the predicted and true values, highlighting the model’s ability to accurately estimate adsorption capacity.
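The three reported metrics can be computed as in the sketch below. It assumes that R² is the squared Pearson correlation between observed and predicted values, whereas NSE is one minus the ratio of residual to total sum of squares; the first definition is an assumption, since the text does not state which form of R² was used. The sample values are made up.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def nse(y_true, y_pred):
    """Nash-Sutcliffe Efficiency: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def r_squared(y_true, y_pred):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


# Made-up example: predictions are perfectly correlated but biased by +0.5.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.5, 4.5]
scores = (r_squared(y_true, y_pred), rmse(y_true, y_pred), nse(y_true, y_pred))
```

In this example the squared correlation is 1.0 while NSE is 0.8, because NSE penalizes systematic bias and correlation does not. Under these definitions, closely matching R² and NSE values indicate little systematic bias in the predictions.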

Table 1: Model’s predictions of adsorption capacity vs. ground truths.

Figure 2: Predictions of the hybrid model compared with the true values.

A further sensitivity analysis shows that clay content significantly influences parameter variability: its effect is more pronounced in the scenario with 20% clay than in the scenario with 5% clay. Other parameters, such as OC and CEC, also exhibit noticeable variability with changes of ±10%. However, factors such as solution pH and soil-to-solution ratio display relatively minor sensitivity in both scenarios. Overall, higher clay content amplifies the impact of various soil and solution parameters on the system. This result demonstrates that the significance of the variables influencing adsorption capacity depends on the specific conditions.
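A one-at-a-time ±10% sensitivity analysis of this kind can be sketched as follows. The `toy_model` function is a hypothetical stand-in with arbitrary coefficients, constructed so that clay multiplies the effect of OC and CEC; it is not the trained hybrid model, and its outputs say nothing about the actual results.

```python
def sensitivity(model, baseline, change=0.10):
    """Relative output change when each input is shifted by -change and
    +change, one variable at a time, with all others held at baseline."""
    base_out = model(baseline)
    result = {}
    for name, value in baseline.items():
        outputs = []
        for factor in (1 - change, 1 + change):
            perturbed = dict(baseline)
            perturbed[name] = value * factor
            outputs.append(model(perturbed))
        result[name] = (outputs[1] - outputs[0]) / base_out
    return result


def toy_model(x):
    """Hypothetical response surface, NOT the model from this study."""
    return x["clay"] * (0.5 * x["oc"] + 0.1 * x["cec"]) + 0.2 * x["ph"]


# Two scenarios that differ only in clay content (values are made up).
low_clay = sensitivity(toy_model, {"clay": 5.0, "oc": 2.0, "cec": 10.0, "ph": 6.0})
high_clay = sensitivity(toy_model, {"clay": 20.0, "oc": 2.0, "cec": 10.0, "ph": 6.0})
```

Comparing `low_clay` and `high_clay` entry by entry shows how the same ±10% perturbation can matter more or less depending on the baseline scenario, which is the kind of condition dependence described above.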

Conclusion

This study demonstrates the effectiveness of the DT-MNLR hybrid machine learning model in predicting cadmium (Cd(II)) adsorption in soils with high accuracy, as evidenced by the high R² value and significantly reduced error metrics. By integrating advanced data preprocessing, decision trees for classification, multi-output nonlinear regression for feature expansion, and backpropagation neural networks for refinement, the model achieves a robust predictive capability that can address the complexities of heterogeneous soil environments. The findings underscore the critical role of clay content, alongside other factors such as OC and CEC, in influencing adsorption behavior under varying conditions. These insights are not only vital for optimizing cadmium contamination assessments but also serve as a benchmark for employing machine learning in environmental remediation strategies.

The implications of this work extend beyond cadmium to broader applications in heavy metal contamination management. Future research could explore the model’s adaptability to other pollutants, such as chromium (Cr(II)), copper (Cu(II)), and lead (Pb(II)), while incorporating additional processes like biological uptake and redox transformations to enhance predictive accuracy. By bridging machine learning with environmental science, this study paves the way for more efficient, data-driven solutions to soil contamination challenges.

References

© 2025 Wen-Qiang Wang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.
