🎓 M.Sc (Tech) Thesis · 2025

Machine Learning Approach For Earthquake Declustering

Intelligent Classification of Seismic Events in New Zealand

👤 Researcher: Md Ashraf
🏛️ Institution: IIT (ISM) Dhanbad
🧑‍🏫 Supervisor: Dr. Niptika Jana
97.44% Accuracy
396K+ Events
44 Years of Data

Understanding Earthquake Declustering

Separating independent earthquakes from dependent aftershocks

🌍

What is Declustering?

Earthquake catalogues contain background events (independent, random tectonic stress release) and dependent events (aftershocks triggered by previous earthquakes). Declustering separates these populations to enable accurate hazard assessment and earthquake forecasting.

Background seismicity is commonly modelled as a stationary Poisson process, whereas clustered events show strong spatiotemporal dependence on earlier earthquakes.

📏

Fixed Parameters

Traditional methods use predetermined windows that don't adapt to regional variations.

🔀

Overlapping Clusters

Complex sequences overlap, making separation difficult with rule-based approaches.

🎯

Subjective Tuning

Results depend heavily on threshold choices, reducing reproducibility.

📊

Statistical Bias

Model assumptions are often violated by the incompleteness of real-world catalogues.

🤖

Machine Learning Solution

01
Pattern Recognition

Automatically learns complex nonlinear relationships without fixed rules

02
Adaptive Learning

Generalizes across regions without manual parameter tuning

03
Multidimensional Analysis

Captures time-space-magnitude relationships simultaneously

04
Objective Classification

Reproducible results based on learned patterns from synthetic training data

Research Framework

Six-stage machine learning pipeline

1

Data Acquisition

GeoNet earthquake catalogue (1980-2024) · 396,267 events · Quality control and preprocessing

2

NND Analysis

Nearest-neighbour distance computation · Rescaled space-time metrics · Bimodal distribution analysis

3

ETAS Simulation

Synthetic catalogue generation · Labeled background/triggered events · Training data preparation

4

Feature Engineering

Extract T⁺, R⁺, dm⁺, N⁺, n_parent, n_child · Temporal, spatial, magnitude features

5

Model Training

Random Forest, SVM, Gradient Boosting, XGBoost · 5-fold cross-validation · Hyperparameter optimization

6

Deployment

Apply best model (XGBoost) to real catalogue · Validate against historical sequences
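Stage 3 of the pipeline relies on synthetic catalogues whose events carry known background/triggered labels. A minimal, time-only sketch of how such a labelled catalogue can be generated with an ETAS-style branching process is shown below. It is an illustration, not the thesis implementation: the function names, the omission of a spatial kernel, and every parameter value (`mu`, `K`, `alpha`, `c`, `p`, `b`, `m0`, `m_max`) are assumptions chosen only to keep the process subcritical.

```python
import math
import random

def _knuth_poisson(rng, lam):
    """Knuth's multiplication method; fine for small means."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def poisson(rng, lam):
    """Poisson draw; large means are split because sums of Poissons are Poisson."""
    total = 0
    while lam > 30.0:
        total += _knuth_poisson(rng, 30.0)
        lam -= 30.0
    return total + _knuth_poisson(rng, lam)

def gr_magnitude(rng, m0, m_max, b):
    """Truncated Gutenberg-Richter magnitude in [m0, m_max) by inverse transform."""
    span = 1.0 - 10 ** (-b * (m_max - m0))
    return m0 - math.log10(1.0 - rng.random() * span) / b

def simulate_etas(t_end, mu=0.5, K=0.1, alpha=0.8, c=0.01, p=1.2,
                  b=1.0, m0=3.0, m_max=7.5, seed=0):
    """Return a time-sorted list of (time_days, magnitude, label).

    label 0 = background, label 1 = triggered. All defaults are illustrative.
    """
    rng = random.Random(seed)
    pending = []
    # Background events: homogeneous Poisson process with rate mu per day.
    t = rng.expovariate(mu)
    while t < t_end:
        pending.append((t, gr_magnitude(rng, m0, m_max, b), 0))
        t += rng.expovariate(mu)
    catalogue = []
    while pending:
        t_par, m_par, label = pending.pop()
        catalogue.append((t_par, m_par, label))
        # Expected number of direct offspring grows exponentially with magnitude.
        n_children = poisson(rng, K * 10 ** (alpha * (m_par - m0)))
        for _ in range(n_children):
            # Omori-type delay sampled by inverting the CDF of (t + c)^(-p), p > 1.
            delay = c * ((1.0 - rng.random()) ** (-1.0 / (p - 1.0)) - 1.0)
            if t_par + delay < t_end:
                pending.append((t_par + delay, gr_magnitude(rng, m0, m_max, b), 1))
    catalogue.sort()
    return catalogue
```

Because every event records whether it came from the background process or from a parent, the output can serve directly as labelled training data once features have been computed for it.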

ML Models

RF Random Forest
SVM Support Vector Machine
GB Gradient Boosting
XGB XGBoost ⭐
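The four models are compared with 5-fold cross-validation. The sketch below shows how the fold indices can be built in plain Python. Stratifying by class label is an assumption (the page only says "5-fold"), and `stratified_kfold` is a hypothetical helper name; in practice a library routine such as scikit-learn's `StratifiedKFold` would normally do this job.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that roughly preserve class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the shuffled indices of each class round-robin into the k folds.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test
```

Each model is then trained on `train` and scored on `test` five times, and the five scores are averaged before models are compared.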

Features

T⁺ R⁺ dm⁺ N⁺ n_parent n_child
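Several of these features describe an event's position in the nearest-neighbour family tree. The sketch below derives three of them from a list of parent links. The definitions are assumptions made for illustration: the page states only that N⁺ is a sibling count, so the child count, the sign convention of the magnitude difference, and the names `family_features`, `n_siblings` and `dm` are hypothetical and may differ from the thesis.

```python
from collections import Counter

def family_features(parents, mags):
    """parents[j] is the index of event j's nearest-neighbour parent, or None.

    Returns one dict per event with assumed definitions:
      n_child    = number of events that name this event as parent
      n_siblings = number of other events sharing this event's parent
      dm         = magnitude of the event minus magnitude of its parent
    """
    child_count = Counter(p for p in parents if p is not None)
    rows = []
    for j, p in enumerate(parents):
        rows.append({
            "n_child": child_count.get(j, 0),
            "n_siblings": child_count[p] - 1 if p is not None else 0,
            "dm": mags[j] - mags[p] if p is not None else 0.0,
        })
    return rows
```

For example, with `parents = [None, 0, 0, 1]` the first event has two children, and events 1 and 2 are siblings of each other.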

Research Plots & Analysis

Key visualizations from the study

Figure 1: Model Performance Comparison

Accuracy, Precision, Recall, and F1-Score comparison across Random Forest, SVM, Gradient Boosting, and XGBoost models.

Figure 2: Feature Importance Ranking

XGBoost feature importance showing N⁺ (siblings count) as the most influential predictor, followed by R⁺ (rescaled distance) and T⁺ (rescaled time).

Figure 3: Classification Distribution

Distribution of background events (58.23%) vs triggered events (41.77%) from XGBoost classification of the New Zealand catalogue.

Figure 4: Temporal Evolution of Seismicity

Background and triggered events over 44 years (1980-2024) showing major earthquake sequences: Edgecumbe (1987), Canterbury (2010-2011), Kaikōura (2016).

Figure 5: Confusion Matrix

XGBoost confusion matrix on synthetic test data showing 98.7% True Positive rate and 94.4% True Negative rate.

Figure 6: Spatial Distribution Map

Spatial distribution of background and triggered events across New Zealand, showing alignment with major tectonic structures.

Figure 7: NND Distribution

Bimodal distribution of nearest-neighbour distance (log η) showing clear separation between background and clustered events.
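The quantity η plotted in Figure 7 is usually computed with the Zaliapin and Ben-Zion nearest-neighbour metric: for an earlier event i and a later event j, η = t · r^d · 10^(−b·m_i), which factors into a rescaled time T = t · 10^(−q·b·m_i) and a rescaled distance R = r^d · 10^(−(1−q)·b·m_i). The sketch below is a brute-force version under stated assumptions: flat (x, y) coordinates in kilometres, and illustrative values `b=1.0`, `d_f=1.6`, `q=0.5` that are not taken from the thesis.

```python
import math

def nearest_neighbour(events, b=1.0, d_f=1.6, q=0.5):
    """events: list of (t_years, x_km, y_km, magnitude), sorted by time.

    Returns one tuple (parent_index, T, R, eta) per event; the first event
    has no earlier candidate, so its parent is None and eta is infinite.
    """
    results = [(None, math.inf, math.inf, math.inf)]
    for j in range(1, len(events)):
        tj, xj, yj, _ = events[j]
        best = (None, math.inf, math.inf, math.inf)
        for i in range(j):
            ti, xi, yi, mi = events[i]
            dt = tj - ti
            if dt <= 0:
                continue  # only strictly earlier events can be parents
            r = max(math.hypot(xj - xi, yj - yi), 1e-3)  # floor avoids eta = 0
            T = dt * 10 ** (-q * b * mi)
            R = r ** d_f * 10 ** (-(1.0 - q) * b * mi)
            eta = T * R
            if eta < best[3]:
                best = (i, T, R, eta)
        results.append(best)
    return results
```

A histogram of log10(eta) over all events with a parent gives the kind of bimodal distribution the caption describes. The double loop is O(n²), so a catalogue of several hundred thousand events would need a spatial index or time windowing in practice.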

Figure 8: ROC Curve Analysis

Receiver Operating Characteristic curve demonstrating model discrimination capability with AUC = 0.98.
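AUC can be read as the probability that a randomly chosen positive event receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; `roc_auc` is a hypothetical helper and the labels and scores in the usage note are made-up values, not thesis data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With `labels = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ordered correctly, so the function returns 0.75. The pairwise loop is quadratic, so large test sets are better served by a rank-based or library implementation.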

Detailed Performance Metrics

Model              Accuracy  Precision  Recall  F1-Score
XGBoost            97.44%    97.66%     98.74%  98.20%
Gradient Boosting  97.11%    97.06%     98.89%  97.97%
Random Forest      96.72%    96.22%     95.15%  97.91%
SVM                94.36%    94.48%     94.36%  94.40%
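The four columns above follow from the confusion-matrix counts by the standard definitions. The sketch below spells them out; the function names are illustrative and the counts in the usage note are invented for the example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

For instance, `classification_metrics(90, 10, 5, 95)` gives an accuracy of 0.925 and a precision of 0.9. Applying `f1_from(0.9766, 0.9874)` to the XGBoost row gives 0.9820 after rounding to four decimals, matching the reported 98.20%. Because F1 is a harmonic mean, it always lies between precision and recall, which makes it a quick sanity check on any row.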

Key Findings

Application to New Zealand earthquake catalogue

🎯
97.44%
Classification Accuracy
XGBoost model on synthetic test data
🟢
230,758
Background Events
58.23% of total catalogue
🔴
165,509
Triggered Events
41.77% aftershocks identified

Feature Importance

1
N⁺ — Siblings Count

Most influential predictor of aftershock clustering

2
R⁺ — Rescaled Distance

Spatial proximity strongly indicates triggering

3
T⁺ — Rescaled Time

Temporal correlation with Omori-Utsu decay
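For reference, the Omori-Utsu law mentioned for T⁺ is usually written as follows, where n(t) is the aftershock rate at time t after the mainshock, K is a productivity constant, c is a small time offset and p is the decay exponent (typically close to 1). The symbols follow the common convention and are not taken from the thesis.

```latex
n(t) = \frac{K}{(t + c)^{p}}
```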

Comparison with Traditional Methods

Gardner-Knopoff: 75% BG
NND Threshold: 62% BG
XGBoost (This Study): 58.23% BG
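To make the contrast with fixed-window methods concrete, the sketch below evaluates the widely quoted closed-form approximation to the Gardner and Knopoff (1974) space-time windows. Whether the comparison above used exactly these coefficients is not stated on this page, so treat the numbers as an assumption; `gardner_knopoff_window` is a hypothetical helper name.

```python
def gardner_knopoff_window(mag):
    """Return (distance_km, time_days) for a mainshock of magnitude mag.

    Later, smaller events inside both windows are treated as aftershocks.
    """
    distance_km = 10 ** (0.1238 * mag + 0.983)
    if mag >= 6.5:
        time_days = 10 ** (0.032 * mag + 2.7389)
    else:
        time_days = 10 ** (0.5409 * mag - 0.547)
    return distance_km, time_days
```

For a magnitude 5.0 mainshock this gives a window of roughly 40 km and 144 days. The window depends only on magnitude, regardless of region or tectonic setting, which is the rigidity the learned classifier is meant to avoid.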

Resources