A Split‑Brain Neuro‑Symbolic Training Method for High‑Velocity Autonomous Coaching from Telemetry

Author: Rabimba Karanjai
Scope: Problem statement + data methodology + model training (no deployment discussion)

Abstract

Real‑time coaching in motorsport is a safety‑critical learning problem: a system must map noisy, high‑frequency telemetry to short, actionable guidance that remains physically consistent and avoids hazardous recommendations.

This paper proposes a “Split‑Brain” training formulation that separates (i) a semantic coaching target (what action/critique should be expressed) from (ii) a reflexive interface (how actions are represented as compact, verifiable tokens). The approach trains a Small Language Model (SLM) in the Gemma family[1] using QLoRA fine‑tuning[2], and introduces a telemetry tokenizer plus teacher‑student synthesis pipeline to generate instruction‑action pairs at scale.

Core contribution: a reproducible method to convert “golden lap” differential telemetry into structured instruction‑tuning data, with an explicit safety-aware penalty that discourages physically contradictory cues.


1. Introduction

Driver coaching tools often visualize telemetry and highlight deltas, but they rarely produce reliable, context-aware micro‑instructions that can be executed immediately without distracting the driver.

A natural framing is demonstration learning: treat expert laps as demonstrations and learn a model that maps states (telemetry windows) to corrective actions, with the caution that naive supervised imitation can suffer from covariate shift and compounding errors when the learner encounters unseen states. [3][4]

Telemetry-based studies in sim racing identify performance-relevant signals (e.g., speed and acceleration features), supporting the premise that telemetry contains enough structure to support automated coaching labels and predictive models. [5]

2. Problem Statement

2.1 Inputs and outputs

Let \(x_{t-k:t}\) denote a telemetry window of length \(k\) ending at time \(t\), containing signals such as speed, steering, brake, throttle, lateral/longitudinal acceleration, yaw/yaw‑rate, and track position.

The model outputs a structured response \(y_t\) consisting of (i) a discrete action \(a_t\) from a finite vocabulary \(\mathcal{A}\) and (ii) an optional short rationale string intended for human interpretability.
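The structured output \(y_t\) can be sketched as a small typed record. A minimal sketch, assuming an illustrative action vocabulary \(\mathcal{A}\) (the paper leaves the exact set open):

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative action vocabulary A; placeholder names, not the paper's final set.
ACTIONS = {"brake_later", "brake_earlier", "throttle_earlier",
           "throttle_later", "turn_in_later", "hold_line"}

@dataclass(frozen=True)
class CoachingResponse:
    """Structured output y_t: a discrete action a_t plus an optional rationale."""
    action: str                    # a_t ∈ A
    reason: Optional[str] = None   # short human-readable rationale

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")

y = CoachingResponse("brake_later", "Brake release came too early on entry.")
```

Keeping the action discrete and the rationale optional separates the machine-checkable part of the output from the free-text part, which matters later for exact-match evaluation (Sec. 5.1).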

2.2 Learning objective

Training minimizes negative log-likelihood of the reference response with a safety-aware regularizer:

\[ \min_{\theta} \; \mathbb{E}\Big[-\log p_{\theta}(y_t^* \mid x_{t-k:t})\Big] + \lambda \, \mathbb{E}\big[\Omega(y_t, x_{t-k:t})\big] \]

The key design goal is that \(\Omega(\cdot)\) penalizes action tokens that contradict a telemetry-defined safe set, while leaving the model free to choose among safe alternatives.

3. Data Methodology

3.1 “Golden lap” differential representation

Rather than learning from raw telemetry alone, supervision is built on differences between a novice lap \(N\) and an expert reference lap \(P\), so the prompt encodes what went wrong relative to a target trajectory.

Align \(N\) to \(P\) by track position \(s\) (preferred) and compute a differential state:

\[ \Delta(s) = \big[ v_N(s)-v_P(s),\; a^{lat}_N(s)-a^{lat}_P(s),\; a^{long}_N(s)-a^{long}_P(s),\; \psi_N(s)-\psi_P(s),\; \dot{\psi}_N(s)-\dot{\psi}_P(s) \big] \]
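A minimal sketch of the alignment and differencing step, assuming each lap is stored as per-channel arrays indexed by track position (channel names are illustrative):

```python
import numpy as np

def differential_state(s_grid, novice, expert):
    """Compute Δ(s) on a common track-position grid.

    novice/expert: dicts with array 's' (track position) and channels
    'v', 'a_lat', 'a_long', 'yaw', 'yaw_rate' (names are assumptions).
    Returns an array of shape (len(s_grid), 5) matching Δ(s) in Sec. 3.1.
    """
    channels = ["v", "a_lat", "a_long", "yaw", "yaw_rate"]
    cols = []
    for ch in channels:
        n = np.interp(s_grid, novice["s"], novice[ch])   # resample novice lap onto grid
        p = np.interp(s_grid, expert["s"], expert[ch])   # resample expert lap onto grid
        cols.append(n - p)                               # novice minus expert, per Δ(s)
    return np.stack(cols, axis=1)
```

Resampling both laps onto one position grid is what makes position-based alignment preferable to time-based alignment: two laps with different lap times still compare like-for-like at each point on track.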

3.2 Telemetry tokenizer (feature-to-token mapping)

Because LLMs are trained over discrete token sequences, continuous telemetry is discretized into a compact vocabulary of “physics tokens” with controlled granularity.

Example token schema:

  • Speed delta: DV=+10mph, DV=-5mph (bin width configurable).
  • Lateral delta: DLAT=-0.2g (bin width configurable).
  • Longitudinal delta: DLONG=+0.3g (captures braking/throttle mismatch).
  • Rotation: DYAW=-3deg or DYAW_RATE=+6deg_s.
  • Context: SECTOR=3, CORNER=7, PHASE=ENTRY|MID|EXIT.
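The schema above amounts to snapping each continuous delta to a bin center and rendering it as a token. A minimal sketch, with illustrative default bin widths:

```python
def quantize(value, bin_width):
    """Snap a continuous delta to the nearest bin center."""
    return round(value / bin_width) * bin_width

def physics_tokens(delta, ctx, v_bin=5.0, g_bin=0.1, yaw_bin=1.0):
    """Map one differential state to a string of physics tokens.

    delta: dict with keys 'dv' (mph), 'dlat'/'dlong' (g), 'dyaw' (deg);
    ctx: dict with 'sector', 'corner', 'phase'. Keys and bin widths are
    illustrative placeholders, not the paper's final configuration.
    """
    toks = [
        f"DV={quantize(delta['dv'], v_bin):+g}mph",
        f"DLAT={quantize(delta['dlat'], g_bin):+g}g",
        f"DLONG={quantize(delta['dlong'], g_bin):+g}g",
        f"DYAW={quantize(delta['dyaw'], yaw_bin):+g}deg",
        f"SECTOR={ctx['sector']}", f"CORNER={ctx['corner']}",
        f"PHASE={ctx['phase']}",
    ]
    return " ".join(toks)
```

Coarser bins shrink the vocabulary and make targets easier to learn; finer bins preserve more physical detail. The bin width is therefore a tunable trade-off that Sec. 4.2 asks to be reported.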

3.3 Teacher‑student synthesis (scalable labels)

To avoid hand-labeling at scale, a deterministic “teacher” flags divergence events and generates paired labels: a discrete action and a short rationale.

Define an error score:

\[ E_t = \lVert p_P(t) - p_N(t) \rVert_2 + \alpha \, \lvert \psi_P(t)-\psi_N(t)\rvert + \gamma \, \lvert v_P(t)-v_N(t)\rvert \]

For \(E_t > \epsilon\), emit a training pair \((\text{prompt}_t, \text{response}_t)\) where \(\text{prompt}_t\) is tokenized telemetry and \(\text{response}_t\) is a constrained structured output.

Algorithm: Synthesize coaching pairs from differential telemetry
Inputs: expert lap P, novice lap N, thresholds ε, tokenizers φ, label rules R
For each aligned position/time t:
  Δt ← compute_differentials(P(t), N(t))
  Et ← error_score(Δt)
  if Et > ε:
     prompt  ← φ(Δt, context(t))
     action  ← R.classify(Δt, context(t))      # finite action vocabulary
     reason  ← R.render_text(action, Δt)       # short template or learned paraphrase
     output  ← format("<action>{action}</action> <reason>{reason}</reason>")
     store(prompt, output)
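The listing above can be sketched as runnable Python, with toy stand-ins for the rule set \(R\), the tokenizer \(\varphi\), and the thresholds (all values are illustrative, not the paper's):

```python
import numpy as np

def error_score(dpos, dyaw, dv, alpha=0.5, gamma=0.2):
    """E_t = ||Δp||_2 + α|Δψ| + γ|Δv|; the weights here are illustrative."""
    return np.linalg.norm(dpos) + alpha * abs(dyaw) + gamma * abs(dv)

def classify(dv, dlong):
    """Toy stand-in for the rule set R: pick one action from a finite vocabulary."""
    if dlong < -0.2:          # novice decelerating much harder than the expert
        return "brake_later"
    if dv < -5.0:             # carrying too little speed through the section
        return "carry_more_speed"
    return "hold_line"

def synthesize(aligned, eps=1.0):
    """Emit (prompt, response) pairs for divergence events E_t > ε."""
    pairs = []
    for rec in aligned:       # rec: one aligned position t, as a dict
        e = error_score(rec["dpos"], rec["dyaw"], rec["dv"])
        if e > eps:
            action = classify(rec["dv"], rec["dlong"])
            prompt = f"DV={rec['dv']:+.0f}mph DLONG={rec['dlong']:+.1f}g"
            reason = f"Detected divergence (E={e:.2f}); suggested {action}."
            pairs.append((prompt,
                          f"<action>{action}</action> <reason>{reason}</reason>"))
    return pairs
```

Because the teacher is deterministic, every emitted pair is traceable back to a specific telemetry divergence, which keeps the synthetic labels auditable.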

4. Model & Training

4.1 Base model

The coaching model is a small, instruction-tuned language model from the Gemma family, chosen so that domain adaptation remains feasible on limited compute. [1]

4.2 Parameter‑efficient fine‑tuning (QLoRA)

Fine‑tuning uses QLoRA, which trains low‑rank adapters while keeping a quantized base model frozen, enabling efficient adaptation with reduced memory usage. [2]

Report the following for reproducibility:

  • Checkpoint: exact Gemma variant and whether it is pretrained or instruction-tuned.[1]
  • Quantization: 4-bit training quantization configuration used for QLoRA.[2]
  • Adapters: \(r\), \(\alpha\), dropout, and the targeted projection modules.
  • Data: window length \(k\), bin sizes, number of pairs, and the final action vocabulary.
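A hypothetical configuration sketch using Hugging Face transformers, peft, and bitsandbytes; the checkpoint name, rank, and target modules below are placeholders, not the paper's reported values:

```python
# Hypothetical QLoRA setup: 4-bit NF4 base model + low-rank adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 quantization from the QLoRA paper [2]
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",                   # placeholder checkpoint; report the exact variant
    quantization_config=bnb,
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, # the adapter hyperparameters to report
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)         # base weights stay frozen and quantized
```

Only the adapter parameters receive gradients, which is what keeps the memory footprint within reach of a single consumer GPU.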

4.3 Safety-aware penalty

Define a telemetry-derived safe action set \(\mathcal{A}_{safe}(x_{t-k:t}) \subseteq \mathcal{A}\) computed by deterministic constraints (e.g., prohibit throttle-up cues during heavy deceleration windows).

A simple instantiation is:

\[ L(\theta) = - \sum_{t} \log p_\theta(y_t^{*}\mid y_{<t}^{*}, x_{t-k:t}) + \lambda \sum_{t} \mathbb{I}\!\left[\hat{a}_t \notin \mathcal{A}_{safe}(x_{t-k:t})\right] \]
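The safe-set construction and the penalty term can be sketched as follows, with an illustrative deceleration threshold and action names (the actual constraint rules are deterministic but unspecified here):

```python
def safe_actions(window):
    """Deterministic safe set A_safe(x): drop cues that contradict telemetry.

    window: dict of recent signal summaries; the -0.6 g threshold is illustrative.
    """
    safe = {"brake_later", "brake_earlier", "throttle_earlier", "hold_line"}
    if window["a_long"] < -0.6:          # heavy deceleration: no throttle-up cues
        safe.discard("throttle_earlier")
    return safe

def safety_penalty(pred_actions, windows, lam=1.0):
    """λ · Σ_t 1[â_t ∉ A_safe(x_t)], the regularizer added to the NLL loss."""
    return lam * sum(a not in safe_actions(w)
                     for a, w in zip(pred_actions, windows))
```

Note that the indicator is computed on decoded actions, so in practice it acts on sampled or greedy predictions rather than flowing gradients directly; the design goal from Sec. 2.2 is preserved because any action inside the safe set incurs zero penalty.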

4.4 Structured response format

Responses are constrained to a strict schema so that evaluation can be performed with exact matching and rule-based checks.

<action>brake_later</action>
<reason>You are releasing the brake too early on entry; carry trail brake slightly longer.</reason>
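Because the schema is strict, validation reduces to a single pattern match. A minimal parser sketch (the character classes are an assumption about what the action and reason fields may contain):

```python
import re

# Strict schema: one <action> tag followed by one <reason> tag, nothing else.
PATTERN = re.compile(
    r"^<action>(?P<action>[a-z_]+)</action>\s*"
    r"<reason>(?P<reason>[^<]+)</reason>$"
)

def parse_response(text):
    """Return (action, reason) if the response matches the schema, else None."""
    m = PATTERN.match(text.strip())
    return (m.group("action"), m.group("reason")) if m else None
```

Responses that fail to parse can be scored as automatic errors, which penalizes format drift as well as wrong actions.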

5. Evaluation (Training‑Focused)

5.1 Offline metrics

  • Action accuracy: exact match of <action> on held-out examples.
  • Safety violation rate: \(\Pr(\hat{a}_t \notin \mathcal{A}_{safe}(x_{t-k:t}))\).
  • Phase confusion: misclassifications across ENTRY/MID/EXIT contexts.
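The three metrics above can be computed from a single pass over held-out examples. A minimal sketch, assuming illustrative field names for predictions, references, and per-example safe sets:

```python
def offline_metrics(examples):
    """Compute action accuracy, safety-violation rate, and phase confusion.

    examples: list of dicts with 'pred', 'gold', 'safe_set', 'pred_phase',
    'gold_phase' (field names are assumptions, not a fixed interface).
    """
    n = len(examples)
    acc = sum(e["pred"] == e["gold"] for e in examples) / n
    viol = sum(e["pred"] not in e["safe_set"] for e in examples) / n
    confusion = {}                       # (gold_phase, pred_phase) -> count
    for e in examples:
        key = (e["gold_phase"], e["pred_phase"])
        confusion[key] = confusion.get(key, 0) + 1
    return {"action_accuracy": acc, "safety_violation_rate": viol,
            "phase_confusion": confusion}
```

Reporting the violation rate separately from accuracy matters: a model can be accurate on average while still emitting occasional unsafe cues, and the two failure modes call for different fixes.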

5.2 Split strategy

In addition to random splits, evaluate by holding out entire laps/sessions (and ideally entire drivers) to test generalization under distribution shift, a known challenge in imitation-style learning setups. [3][4]
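Group-wise holdout can be sketched in a few lines; the group key and holdout fraction below are illustrative:

```python
import random

def grouped_split(examples, key="driver", held_out_frac=0.25, seed=0):
    """Hold out whole groups (laps/sessions/drivers) rather than random rows.

    examples: list of dicts carrying a group label under `key` (name assumed).
    Guarantees no group appears in both partitions.
    """
    groups = sorted({e[key] for e in examples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_held = max(1, int(len(groups) * held_out_frac))
    held = set(groups[:n_held])
    train = [e for e in examples if e[key] not in held]
    test = [e for e in examples if e[key] in held]
    return train, test
```

Random row-level splits leak near-duplicate telemetry windows from the same lap into both partitions and overstate accuracy; holding out whole groups is the direct remedy.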


References

  1. Gemma Team. Gemma: Open Models Based on Gemini Research and Technology. arXiv:2403.08295, 2024. https://arxiv.org/abs/2403.08295
  2. Dettmers et al. QLoRA: Efficient Finetuning of Quantized LLMs. arXiv:2305.14314, 2023. https://arxiv.org/abs/2305.14314
  3. Correia and Alexandre. A Survey of Demonstration Learning. arXiv:2303.11191, 2023. https://arxiv.org/abs/2303.11191
  4. Codevilla et al. Exploring the Limitations of Behavior Cloning for Autonomous Driving. ICCV, 2019.
  5. AI‑enabled prediction of sim racing performance using telemetry data. ScienceDirect, 2024.

Cite this work

@misc{HighVelocityAI2025,
  title={A Split-Brain Neuro-Symbolic Training Method for High-Velocity Autonomous Coaching from Telemetry},
  author={Rabimba Karanjai and Austin Bennett and Ajeet Mirwani and Alvaro Huanca Mamani and Hemanth and Jesse Nowlin and Jigyasa Grover and Lynn Langit and Margaret M. and Sebastian Gomez and Vikram Tiwari},
  year={2025},
  howpublished={Blog Post / Working Paper}
}
