PCA Risk Landscape
RO
Figure 1: Principal Component Analysis of Risk Metrics.
Each observation is projected onto two principal components derived from nine risk features
(MRS, SRS, BVI, and six Hopfield detector scores). Points are coloured by risk profile classification
(Q75-per-language activation thresholds). Point radius encodes the number of Hopfield detectors
that flagged the observation as anomalous (0–6). Clusters with high spatial coherence and elevated
Hopfield counts indicate coordinated manipulation signatures that persist across multiple independent
detection channels.
PCA computed via power iteration on the covariance matrix of z-normalised features.
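The projection described in the note above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the report's implementation: the function names (`z_normalise`, `covariance`, `power_iteration`, `pca_2d`), the use of deflation to obtain the second component, and the iteration limit and tolerance are all hypothetical choices.

```python
import math
import random

def z_normalise(rows):
    """Column-wise z-scores; constant columns become all zeros."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def covariance(z):
    """Sample covariance matrix of rows that are already centred."""
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]

def power_iteration(c, n_iter=1000, tol=1e-12, seed=0):
    """Dominant eigenpair of a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    d = len(c)
    v = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(c[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        w = [x / norm for x in w]
        delta = max(abs(a - b) for a, b in zip(w, v))
        v = w
        if delta < tol:
            break
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigval = sum(v[i] * sum(c[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v

def pca_2d(rows):
    """Project rows onto the first two principal components."""
    z = z_normalise(rows)
    c = covariance(z)
    lam1, v1 = power_iteration(c, seed=0)
    # Deflation (assumed): subtract the first component so the next run finds the second.
    d = len(c)
    c2 = [[c[i][j] - lam1 * v1[i] * v1[j] for j in range(d)] for i in range(d)]
    lam2, v2 = power_iteration(c2, seed=1)
    coords = [(sum(a * b for a, b in zip(r, v1)),
               sum(a * b for a, b in zip(r, v2))) for r in z]
    return coords, (lam1, lam2), (v1, v2)
```

Calling `pca_2d` on a list of nine-value feature rows returns one `(PC1, PC2)` pair per observation, plus the two eigenvalues and component vectors. The sign of each component is arbitrary, so a plot may appear mirrored between runs with different seeds.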
Ranked Observations (Top 50)
| # | Article | Lang | Score | MRS | SRS | BVI | Profile | Hopfield | Detectors |
|---|---|---|---|---|---|---|---|---|---|
Risk Profile Distribution
Cross-Article Editor Network
Figure 2: Cross-Article Contributor Network.
Nodes represent Wikipedia articles within the corpus, sized proportionally to their combined risk score
and coloured by risk profile type. Edges connect articles that share at least one common editor among
their top five contributors (as identified through MediaWiki API usercontribs queries).
Edge colour distinguishes individual editors; thicker edges indicate editors who have been flagged by
Wikipedia's Sockpuppet Investigations (SPI) process. This network reveals coordination
structures that cannot be recovered from CSV-derived features alone: an editor
systematically editing multiple politically sensitive articles may indicate either legitimate topical
expertise or coordinated information operations. The graph layout uses a force-directed algorithm
(Fruchterman-Reingold via D3.js) with charge repulsion calibrated to corpus size.
Network constructed from MediaWiki API enrichment data. Isolated nodes (no shared editors) are omitted for clarity.
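The edge rule in the caption (two articles are linked when they share an editor among their top five contributors) can be sketched as follows. The input shape, a mapping from article title to an ordered contributor list, and the function names are assumptions made for illustration; the API enrichment step that produces the contributor lists is not shown.

```python
from collections import defaultdict
from itertools import combinations

TOP_N = 5  # "top five contributors", as stated in the caption

def build_editor_edges(top_editors):
    """Return ({(article_a, article_b): shared_editors}, connected_articles).

    top_editors: dict mapping article title -> contributors ordered by rank.
    """
    articles_by_editor = defaultdict(set)
    for article, editors in top_editors.items():
        for editor in editors[:TOP_N]:
            articles_by_editor[editor].add(article)

    edges = defaultdict(set)
    for editor, articles in articles_by_editor.items():
        # Every pair of articles sharing this editor gets an edge.
        for a, b in combinations(sorted(articles), 2):
            edges[(a, b)].add(editor)

    # Articles without any shared editor never appear in an edge,
    # which corresponds to omitting isolated nodes.
    connected = {article for pair in edges for article in pair}
    return dict(edges), connected

def cross_article_editors(top_editors):
    """Rows of (editor, count, articles) for editors seen on 2+ articles."""
    seen = defaultdict(set)
    for article, editors in top_editors.items():
        for editor in editors[:TOP_N]:
            seen[editor].add(article)
    rows = [(e, len(a), sorted(a)) for e, a in seen.items() if len(a) > 1]
    return sorted(rows, key=lambda row: (-row[1], row[0]))
```

Keeping the set of shared editors on each edge, rather than a plain count, leaves enough information to colour edges per editor as the figure does.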
Cross-Article Editors
| Editor | Count | Articles |
|---|---|---|
Detector Activation Rates
Anomaly Count Distribution
Anomaly Detection Heatmap (Top 100 Flagged Observations)
Figure 3: Hopfield Anomaly Detection Matrix.
Each row represents an observation (article × month), and each column represents one of six
binary Hopfield detectors: Temporal, Network, Contributor, Manipulation, Epistemic, and Volatility.
Cells are coloured by detector energy score (darker = higher anomaly energy). The matrix is sorted
by descending total anomaly count, then by combined risk score. Multi-detector convergence,
where two or more detectors flag the same observation, provides substantially higher confidence
than any single detector, as it indicates that anomalous patterns manifest simultaneously across
separate feature spaces. In the energy-based Hopfield formulation, stored
normal patterns act as attractors: observations that settle into high-energy states (far from
any attractor) exhibit behaviour inconsistent with the learned baseline.
Hopfield detectors use the Storkey learning rule with feature-specific thresholds (temporal: 0.3, network: 0.25, account: 0.4).
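For readers unfamiliar with the Storkey rule, the sketch below shows how a weight matrix is built from stored patterns and how the energy of a candidate state is evaluated. It is an illustration under assumptions: patterns are bipolar (+1/−1) vectors, the diagonal of the weight matrix is kept at zero, and no bias term is used. How the report binarises its features and how the thresholds listed above are applied to the resulting energies is not specified here, so neither is modelled.

```python
def storkey_train(patterns):
    """Hopfield weights from bipolar (+1/-1) patterns via the Storkey rule."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        # field[i] = sum_k w[i][k] * xi[k], using the weights learned so far.
        field = [sum(w[i][k] * xi[k] for k in range(n)) for i in range(n)]

        def h(i, j):
            # Local field at unit i, excluding the contributions of units i and j.
            excluded = w[i][i] * xi[i]
            if j != i:
                excluded += w[i][j] * xi[j]
            return field[i] - excluded

        # Incremental update; the diagonal is held at zero (an assumed convention).
        w = [[0.0 if i == j else
              w[i][j] + (xi[i] * xi[j] - xi[i] * h(j, i) - h(i, j) * xi[j]) / n
              for j in range(n)]
             for i in range(n)]
    return w

def energy(w, state):
    """Hopfield energy E(s) = -1/2 * sum_ij w_ij * s_i * s_j (no bias term)."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))
```

With a single stored pattern the update reduces to the Hebbian outer product, so the stored pattern sits at an energy minimum and flipping any one unit raises the energy. That gap is the quantity an energy-based detector would compare against a threshold.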
Engine Profiles by Cluster
Figure 4: Engine Score Profiles by Risk Cluster.
Each radar chart shows the mean engine scores (Temporal, Network, Contributor, Manipulation,
Cross-Language Synchrony) for observations grouped by their k-means cluster assignment.
Cluster profiles reveal qualitatively distinct manipulation strategies: some clusters exhibit
elevated temporal scores (suggesting coordinated editing bursts), while others show network
or contributor anomalies (suggesting sockpuppet or single-purpose account activity).
The API enrichment layer contributes additional sub-signals to each engine, notably
cross-article editor overlap (Network engine) and single-purpose account scoring
(Contributor engine), which were previously unavailable from CSV-derived features.
Radar values are mean engine scores per cluster, scaled to [0, 1]. Cluster assignment via k-means (k=5) on z-normalised feature matrix.
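As a rough sketch of how the radar values could be derived once cluster labels exist, the snippet below averages engine scores per cluster. The scaling method is an assumption: each engine is min-max scaled across all observations before averaging, which is only one way to obtain values in [0, 1]. The field names in `ENGINES` and the `cluster` key are hypothetical, and the k-means step itself is not shown.

```python
from collections import defaultdict

# Hypothetical field names for the five engines named in the caption.
ENGINES = ["temporal", "network", "contributor", "manipulation", "synchrony"]

def radar_profiles(observations):
    """Mean scaled engine score per cluster.

    observations: list of dicts with a 'cluster' label and one score per engine.
    """
    # Assumed scaling: min-max per engine across the whole corpus.
    lo = {e: min(o[e] for o in observations) for e in ENGINES}
    hi = {e: max(o[e] for o in observations) for e in ENGINES}

    sums = defaultdict(lambda: dict.fromkeys(ENGINES, 0.0))
    counts = defaultdict(int)
    for o in observations:
        counts[o["cluster"]] += 1
        for e in ENGINES:
            span = hi[e] - lo[e]
            scaled = (o[e] - lo[e]) / span if span else 0.0
            sums[o["cluster"]][e] += scaled

    return {c: {e: sums[c][e] / counts[c] for e in ENGINES} for c in counts}
```

Scaling before averaging keeps every radar axis on a common range, so a cluster's shape reflects its relative standing on each engine rather than the engine's raw units.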
API Enrichment Impact
Governance Asymmetry
Interpretation notes.
The governance asymmetry analysis identifies articles where Wikipedia's protective mechanisms
(semi-protection, extended-confirmed protection, full protection) were applied with substantial
delay relative to the onset of anomalous editing patterns. A governance lag
exceeding 30 days suggests that the article was exposed to sustained manipulation before
administrative intervention. The enrichment layer provides actual protection log timestamps
from the MediaWiki API, replacing the binary proxy used in the CSV-only baseline.
Articles with high manipulation scores and absent governance flags represent the
highest-risk category: active manipulation with no administrative response.
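The lag rule can be made concrete with a short sketch. The 30-day threshold comes from the text above; everything else is an assumption for illustration, including the function names, the status labels, and in particular the `HIGH_MANIPULATION` cutoff, which the report does not state.

```python
from datetime import date

LAG_THRESHOLD_DAYS = 30   # threshold stated in the interpretation notes
HIGH_MANIPULATION = 0.5   # hypothetical cutoff, not taken from the report

def governance_lag_days(anomaly_onset, protection_applied):
    """Days between anomaly onset and the first protection action, or None."""
    if protection_applied is None:
        return None
    return (protection_applied - anomaly_onset).days

def governance_status(anomaly_onset, protection_applied, manipulation_score):
    """Label an article by how (and whether) protection followed the anomaly."""
    lag = governance_lag_days(anomaly_onset, protection_applied)
    if lag is None:
        # No protection log entry at all.
        if manipulation_score >= HIGH_MANIPULATION:
            return "unprotected-high-risk"
        return "unprotected"
    return "delayed" if lag > LAG_THRESHOLD_DAYS else "timely"
```

For example, an anomaly onset of 1 January 2023 followed by protection on 15 February 2023 gives a lag of 45 days, which this sketch labels as delayed.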
Feature Correlation Matrix
Figure 5: Pearson Correlation Matrix of Risk Features.
The matrix reveals a three-cluster structure in the nine-dimensional feature space,
validated across both the Moldova (MD) and Romania (RO) corpora.
1. Q75 Orthogonality.
The three composite scores (MRS, SRS, BVI) are near-orthogonal to each other.
MRS × SRS: r = 0.10 (RO: −0.10); MRS × BVI: r = −0.02 (RO: −0.02).
This indicates that the composites are not linearly redundant: manipulation risk,
sourcing quality, and bias vulnerability each capture a largely separate risk dimension.
2. Behavioural Cluster.
Network, Contributor, and Manipulation detectors form a tightly correlated block:
Network × Manipulation: r = 0.86 (RO: 0.93);
Contributor × Manipulation: r = 0.79 (RO: 0.91);
Network × Contributor: r = 0.71 (RO: 0.81).
This triad represents coordinated editing behaviour: when one detector fires, the others
tend to fire simultaneously, suggesting a common underlying signal of organised activity.
3. Temporal-Volatility Cluster.
Temporal × Volatility: r = 0.68 (RO: 0.92).
These two detectors share a distinct time-series signal. The much stronger coupling in the
RO corpus (0.92 vs 0.68) suggests that Romanian Wikipedia edit wars produce tighter
burst-volatility co-occurrence than Moldovan articles.
4. Epistemic Isolation.
The Epistemic detector shows weak connections to all other features: the strongest is
SRS × Epistemic at r = 0.27 (RO: 0.39). Notably, the sign of the Epistemic correlation
flips between corpora for several pairs, indicating that sourcing-quality anomalies
manifest differently in the MD and RO Wikipedia ecosystems.
5. BVI Independence.
BVI shows no meaningful correlation with any Hopfield detector (all |r| < 0.07 in MD,
< 0.14 in RO). The Bias Vulnerability Index is effectively invisible to the energy-based
anomaly detection: it captures a structural article property (framing, balance) that
leaves no linear trace in editing-behaviour features.
6. MRS-Hopfield Coupling (Cross-Corpus Divergence).
In MD, MRS shows moderate correlations with behavioural detectors (0.39–0.44).
In RO, the same pairs reach 0.72–0.75, nearly double. This suggests that
manipulation risk in the Romanian corpus is more tightly coupled with detectable
behavioural anomalies, while Moldovan manipulation may operate through subtler channels.
Pearson correlations computed on raw (un-normalised) feature values. MD values shown first; RO values in parentheses.
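A minimal sketch of the pairwise computation behind such a matrix follows. The function names and the column-dictionary input are assumptions for illustration, and any values passed in an example are made up rather than drawn from either corpus.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for a constant column
    return cov / (sx * sy)

def correlation_matrix(features):
    """features: dict mapping feature name -> list of values (one per observation)."""
    names = list(features)
    return {(a, b): pearson(features[a], features[b])
            for a in names for b in names}
```

Pearson's r is unchanged by positive linear rescaling of either variable, so computing it on raw values yields the same coefficients as computing it on z-normalised features.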