Supervised Learning

NeuroArcane
Nov 22, 2025

Supervised Learning at Neuroarcane

Supervised learning offers a powerful lens through which Neuroarcane interprets, forecasts, and classifies patterns in global internet behavior. Internet interference—whether caused by throttling, DNS tampering, protocol blocking, or targeted shutdowns—manifests through measurable distortions in the underlying traffic.

Measurement projects such as OONI, IODA, CAIDA Ark, and RIPE Atlas provide continuous measurements of latency, blocking signatures, and DNS response anomalies across networks around the world. In this post, we focus on supervised learning and show how regression and classification each play a distinct role in assessing and predicting adversarial manipulation of internet traffic.

Our goal is to demonstrate how supervised learning methods help Neuroarcane detect and anticipate network interference. To do this, we examine two specific use cases grounded in real-world measurable behavior.

Regression

Regression models are effective when we need to quantify how much interference is happening or forecast when interference is likely to escalate. Many forms of censorship begin gradually: rising latency, packet loss spikes, increasingly unstable TCP handshake success rates, or declining DNS resolution reliability. These early signals shift systematically before full disruption occurs.

Regression enables us to model these signals as continuous variables and forecast their trajectories. For example, consider a simulated dataset modeled after OONI’s TCP blocking and latency measurements. In deployment, these values correspond to tcp_connect, http_requests, or quic_handshake anomalies. For this example, we generated 60 days of synthetic signal measurements resembling a network under gradually intensifying interference.
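
A series like this can be generated in a few lines. The sketch below is only an illustration: the function name simulate_blocking, the drift range, the weekly cycle, and the noise level are all assumptions chosen to produce a plausible shape, not values taken from OONI or from Neuroarcane's own generator.

```python
import math
import random


def simulate_blocking(days=60, seed=0):
    """Return one synthetic blocking-probability value per day.

    The shape (upward drift + weekly oscillation + noise) is an assumption
    made for illustration; it is not derived from real measurements.
    """
    rng = random.Random(seed)  # seeded so the series is reproducible
    series = []
    for day in range(days):
        drift = 0.1 + 0.7 * day / (days - 1)           # slow escalation, 0.1 to 0.8
        cycle = 0.1 * math.sin(2 * math.pi * day / 7)  # weekly oscillation
        noise = rng.gauss(0, 0.03)                     # measurement jitter
        # Clamp so the value stays a valid probability.
        series.append(min(1.0, max(0.0, drift + cycle + noise)))
    return series


blocking = simulate_blocking()
print(f"days={len(blocking)} first={blocking[0]:.2f} last={blocking[-1]:.2f}")
```

Seeding the generator keeps the series identical across runs, which makes later regression and classification steps easier to compare.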

The figure below visualizes the time series of blocking probability. Notice the oscillatory pattern with an underlying drift, a behavior commonly observed in regions under intermittent filtering pressure.

This trend alone is informative, but regression also lets us quantify how blocking probability relates to another critical variable: latency. We analyze the relationship between the two over the same period.

Even in this synthetic dataset, the correlation is visible: as blocking probability increases, latency rises in a non-linear fashion. Fitting a regression model, linear or non-linear, allows us to extrapolate expected latency or interference levels a few days ahead. In practice, this becomes an early-warning signal for users and VPN providers. When the model predicts that blocking probability will exceed a threshold within 48 hours, Neuroarcane alerts downstream systems to prepare countermeasures such as protocol rotation, decoy routing activation, or fallback tunnels.
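
To make the 48-hour threshold check concrete, here is a minimal sketch that fits a straight-line trend to recent daily blocking probabilities with ordinary least squares and extrapolates two days ahead. The helper names fit_line and will_exceed, the 14-day window, and the 0.7 threshold are assumptions for illustration. A linear trend is the simplest choice and stands in for whichever linear or non-linear model is actually used.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def will_exceed(series, threshold=0.7, horizon_days=2, window=14):
    """Fit a trend to the last `window` days and extrapolate `horizon_days` ahead.

    Returns the forecast value and whether it reaches the threshold.
    """
    recent = series[-window:]
    days = list(range(len(recent)))
    slope, intercept = fit_line(days, recent)
    forecast = slope * (len(recent) - 1 + horizon_days) + intercept
    return forecast, forecast >= threshold


# Hypothetical daily blocking probabilities with a steady upward drift.
history = [0.40 + 0.025 * d for d in range(14)]
forecast, alert = will_exceed(history)
print(f"forecast in 48h: {forecast:.3f}, alert: {alert}")
```

Fitting only a recent window keeps the forecast responsive to the latest drift. A longer window would smooth out oscillations at the cost of reacting later.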

Regression is the bridge between raw measurement fluctuations and predictive operational intelligence.

Classification

While regression helps forecast continuous drift, classification allows us to answer a different type of question: Has censorship occurred or not?

Supervised classifiers are trained on labeled measurements from historical events such as elections, protests, and regional conflicts to distinguish between organic network degradation and deliberate state-level tampering. For example, OONI’s historical datasets include labeled incidents where websites were blocked, DNS responses were altered, or TLS handshakes were tampered with. Combining these with additional datasets from IODA, which labels outage events, or CAIDA, which captures backbone anomalies, we can build a classifier that predicts the presence of censorship.

To demonstrate the mechanics, consider our synthetic dataset again. We convert blocking probability into a binary label representing whether an interference event is occurring. For instance, when blocking probability exceeds 0.7, we mark the instance as event = 1; otherwise, event = 0. Using logistic regression, we model the decision boundary separating normal from anomalous behavior.
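
The labeling rule and the logistic fit can be sketched as follows. Several things here are assumptions rather than details from the pipeline: the data is regenerated synthetically, latency is used as the single input feature (the text does not say which feature the model sees), and the model is trained with a small hand-written gradient descent loop so the example has no dependencies. In practice a library implementation would be the usual choice.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(event=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# Synthetic stand-in data (assumed shapes, not real measurements).
rng = random.Random(1)
blocking = [min(1.0, max(0.0, 0.1 + 0.8 * d / 59 + rng.gauss(0, 0.05)))
            for d in range(60)]
latency_ms = [80 + 200 * p ** 2 + rng.gauss(0, 10) for p in blocking]

# Binary label from the 0.7 rule: event = 1 when blocking probability exceeds 0.7.
events = [1 if p > 0.7 else 0 for p in blocking]

# Standardize the feature so gradient descent converges with a fixed step size.
mean = sum(latency_ms) / len(latency_ms)
std = math.sqrt(sum((v - mean) ** 2 for v in latency_ms) / len(latency_ms))
features = [(v - mean) / std for v in latency_ms]

w, b = train_logistic(features, events)


def event_probability(latency):
    """Return p(event = 1) for a raw latency value in milliseconds."""
    return sigmoid(w * (latency - mean) / std + b)


predictions = [1 if event_probability(v) >= 0.5 else 0 for v in latency_ms]
accuracy = sum(p == y for p, y in zip(predictions, events)) / len(events)
print(f"training accuracy={accuracy:.2f}, p(event | 250 ms)={event_probability(250):.2f}")
```

The reported accuracy is measured on the training data itself, so it only confirms that the boundary was learned. A real evaluation would hold out separate labeled incidents.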

We also expand this feature set to include DNS consistency scores, TLS fingerprint changes, packet retransmission spikes, and protocol negotiation failures. The classifier trained on such inputs outputs a probability p(event = 1) for each measurement period, offering a soft prediction that feeds directly into Neuroarcane’s monitoring layer.

Neuroarcane’s Intelligence Pipeline

In Neuroarcane’s operational pipeline, regression and classification serve complementary roles in understanding the mechanics of network interference. Regression uncovers continuous degradation patterns and provides early forecasting of disruption, while classification identifies binary interference events that require immediate response.

Regression monitors slow-rising interference pressure, catching early warning signs such as latent throttling, high-latency filtering, or subclinical DNS anomalies. Classification detects sharp transitions, determining when these signals cross from “suspicious” to “active interference”. The outputs of both models feed into Neuroarcane’s internal decision engine, which surfaces alerts, informs countermeasure protocols, and refines adaptive routing strategies.
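
One way to picture how the two outputs might be combined is a small rule that maps them to an alert level. Everything in this sketch is hypothetical: the names AlertLevel, ModelOutputs, and decide, the two thresholds, and the precedence of the classifier over the forecast are illustrative assumptions, not a description of the actual decision engine.

```python
from dataclasses import dataclass
from enum import Enum


class AlertLevel(Enum):
    NORMAL = "normal"
    EARLY_WARNING = "early_warning"
    ACTIVE_INTERFERENCE = "active_interference"


@dataclass(frozen=True)
class ModelOutputs:
    forecast_blocking: float  # regression forecast of blocking probability, 48h ahead
    event_probability: float  # classifier output p(event = 1) for the current period


def decide(outputs, forecast_threshold=0.7, event_threshold=0.5):
    """Map both model outputs to an alert level (thresholds are illustrative)."""
    # A confident classifier signal means interference is already under way.
    if outputs.event_probability >= event_threshold:
        return AlertLevel.ACTIVE_INTERFERENCE
    # Otherwise, a high forecast means interference is expected soon.
    if outputs.forecast_blocking >= forecast_threshold:
        return AlertLevel.EARLY_WARNING
    return AlertLevel.NORMAL


level = decide(ModelOutputs(forecast_blocking=0.78, event_probability=0.12))
print(level.value)  # early_warning
```

Checking the classifier first reflects the split described above: classification flags events that need an immediate response, while the regression forecast buys preparation time.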

We also incorporate richer datasets, such as OONI Web Connectivity, OONI app-specific interference signatures, RIPE Atlas traceroutes and ping failures, IODA BGP withdrawal signals and darknet traffic drops, and CAIDA Ark backbone measurement disruptions.

As Neuroarcane expands, both supervised learning approaches, regression and classification, will serve as foundational analytical tools for detecting and forecasting internet interference.
