Interpretation of CPTu Data Using Machine Learning Techniques to Develop the Ground Model of a Dam

Building a ground model manually can be time-consuming, because large amounts of data must be classified to define the extent and spatial distribution of the different soil materials. 

This paper examines the application of machine learning (ML) methods, combined with in-situ geotechnical testing data, to develop the ground model for a downstream dam founded on both weak and liquefiable soils. 

The dam covers a linear extent of approximately 800 m and was extensively characterized by means of in-situ tests, including 206 cone penetration tests (CPTu), 37 boreholes and 35 test pits. The performance of two unsupervised ML clustering algorithms is compared: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and an extended version with a hierarchical component (HDBSCAN). 

The clustering uses CPTu data consisting of the normalized cone tip resistance (Qtn) and the normalized sleeve friction (Fr) as functions of elevation. Nearby borehole logs are used to evaluate both clustering methods on a single CPTu sounding under different clustering parameters. 
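As an illustration of density-based clustering on a single sounding, the following is a minimal sketch of the basic DBSCAN procedure in plain Python. It is not the implementation used in this study. The sounding is synthetic, and the feature choice (log10 of Qtn, log10 of Fr, and elevation, each standardized) as well as the values of `eps` and `min_pts` are assumptions made only for the example.

```python
import math

NOISE = -1          # label for readings that belong to no cluster
UNVISITED = None    # label for readings not yet processed

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def region_query(points, i, eps):
    """Indices of all points within eps of point i (point i included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Basic DBSCAN: grow clusters outward from core points."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: reachable, not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels

def build_sounding():
    """Synthetic sounding (hypothetical values): 100 sand-like readings
    over 100 clay-like readings at 0.05 m spacing, plus one spike."""
    rows = []
    for k in range(200):
        elevation = 100.0 - 0.05 * k
        sand = k < 100
        log_qtn = (2.0 if sand else 0.7) + 0.05 * math.sin(1.3 * k)
        log_fr = (-0.3 if sand else 0.6) + 0.05 * math.cos(0.7 * k)
        rows.append([log_qtn, log_fr, elevation])
    rows.append([3.0, 0.6, 92.0])        # isolated spike within the clay
    return rows

features = standardize(build_sounding())
labels = dbscan(features, eps=0.5, min_pts=5)   # assumed parameters
n_clusters = len({lab for lab in labels if lab != NOISE})
print("clusters:", n_clusters, "noise readings:", labels.count(NOISE))
```

In this synthetic case the two layers form separate dense groups in feature space and the isolated spike is labelled as noise. In practice a library implementation such as scikit-learn's `DBSCAN` would normally replace the hand-written function, and the result depends strongly on the chosen `eps` and `min_pts`.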

A global clustering covering several CPTu soundings is then performed, and the results are compared with the ground model built manually in Leapfrog software. Both methods perform very well, with HDBSCAN giving better and more robust results.