Engineering an Automated Analysis Pipeline

Tie everything together into a reproducible, mostly automated pipeline that joins your scouting data with TBA and Statbotics and outputs picklist-ready rankings.

The capstone

The advanced techniques only pay off if they run reliably at event pace. This lesson designs the end-to-end pipeline: scouting in, public APIs joined, validated metrics out, rebuildable in one command between match cycles.

Architecture

Keep stages decoupled so any one can be debugged or replaced:

Ingest: scanned QRScout rows append to a raw source (a Google Sheet or CSV). Append-only; never edit in place.
Enrich: a script pulls TBA (event_matches, event_oprs, event_rankings) and Statbotics (get_team_event) for the event, keyed by team number.
Compute: join scouting averages with OPR/EPA, compute custom metrics (per-level coral rates, endgame reliability, defense rating), and set validation flags.
Output: a rank table and per-match prep sheets, regenerated on demand.
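
To make the Compute stage concrete, here is a minimal sketch of the custom metrics named above. The column names (coralL1 to coralL4, climbAttempted, climbSucceeded, defenseRating) and the compute_metrics function are assumptions for illustration; substitute whatever fields your QRScout config actually exports.

```python
import pandas as pd

# Hypothetical column names; match them to your own QRScout export.
CORAL_COLS = ["coralL1", "coralL2", "coralL3", "coralL4"]


def compute_metrics(raw: pd.DataFrame) -> pd.DataFrame:
    """Return one row per team with coral rates, endgame reliability, defense."""
    by_team = raw.groupby("teamNumber")

    # Per-level coral rate: average pieces scored per scouted match.
    out = by_team[CORAL_COLS].mean().add_suffix("_per_match")
    out["matches_scouted"] = by_team.size()

    # Endgame reliability: successes / attempts. Teams that never attempted
    # a climb get NaN rather than a misleading 0 or 1.
    attempts = by_team["climbAttempted"].sum()
    successes = by_team["climbSucceeded"].sum()
    out["endgame_reliability"] = successes / attempts.where(attempts > 0)

    # Defense rating: mean of the scout's subjective score.
    out["defense_rating"] = by_team["defenseRating"].mean()
    return out


demo_raw = pd.DataFrame({
    "teamNumber":     [254, 254, 1678],
    "coralL1":        [0, 1, 2],
    "coralL2":        [1, 1, 0],
    "coralL3":        [2, 2, 1],
    "coralL4":        [3, 5, 0],
    "climbAttempted": [1, 1, 0],
    "climbSucceeded": [1, 0, 0],
    "defenseRating":  [2, 4, 3],
})
demo_metrics = compute_metrics(demo_raw)
print(demo_metrics)
```

Keeping this in its own function means the Compute stage can be tested against a handful of hand-checked rows before the event, independently of ingest and the API calls.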

A concrete skeleton

import pandas as pd
import statbotics
import tbapy

EVENT = '2025cc'
tba = tbapy.TBA('YOUR_TBA_KEY')                # your TBA Read API key
sb  = statbotics.Statbotics()

# 1. Ingest scouting (exported from your QRScout sheet)
raw = pd.read_csv('scouting_raw.csv')
agg = raw.groupby('teamNumber').mean(numeric_only=True)

# 2. Enrich with public analytics, keyed by team number
oprs = tba.event_oprs(EVENT)['oprs']           # {'frc254': 78.3, ...}
# int(t) guards against team numbers that were read from the CSV as floats
agg['opr'] = [oprs.get(f'frc{int(t)}') for t in agg.index]
# Depending on the statbotics package version, 'epa' may be a nested dict
# rather than a single number; pull out the field you need before ranking on it.
agg['epa'] = [sb.get_team_event(int(t), EVENT).get('epa')
              for t in agg.index]              # EPA at this event

# 3. Compute validation flags
agg['low_sample'] = raw.groupby('teamNumber').size() < 3
agg['scout_vs_opr_gap'] = (agg['ptsContributed'] - agg['opr']).abs()

# 4. Output the rank table
agg.sort_values('ptsContributed', ascending=False).to_csv('rank.csv')

Use the Statbotics fields parameter and TBA simple/keys options to keep responses small and fast, and send the TBA ETag back as If-None-Match so repeated refreshes return a cheap 304 Not Modified.
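
The ETag round trip can be sketched with the standard library alone. This is an illustration, not part of tbapy: fetch_cached, http_get, and the shape of the cache dict are hypothetical names chosen here, while the base URL and the X-TBA-Auth-Key header come from the TBA API v3. The same function also falls back to the last good copy when the request fails, which covers a flaky venue connection.

```python
import json
import urllib.error
import urllib.request

TBA_BASE = "https://www.thebluealliance.com/api/v3"


def http_get(url, headers):
    """Return (status, headers, body); treat 304 as a normal response."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, dict(resp.headers), resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib raises on 304; it is not an error for us
            return 304, dict(err.headers), ""
        raise


def fetch_cached(path, auth_key, cache, get=http_get):
    """Fetch a TBA endpoint, reusing cached data on 304 or on failure.

    `cache` maps path -> {"etag": str or None, "data": parsed JSON}
    (a hypothetical layout; persist it to disk however you prefer).
    """
    headers = {"X-TBA-Auth-Key": auth_key}
    entry = cache.get(path)
    if entry and entry.get("etag"):
        headers["If-None-Match"] = entry["etag"]
    try:
        status, resp_headers, body = get(TBA_BASE + path, headers)
    except OSError:
        # Network down or request rejected: use the last good copy if any.
        if entry:
            return entry["data"]
        raise
    if status == 304 and entry:
        return entry["data"]  # unchanged since the last refresh
    data = json.loads(body)
    etag = next((v for k, v in resp_headers.items() if k.lower() == "etag"), None)
    cache[path] = {"etag": etag, "data": data}
    return data
```

A call such as fetch_cached('/event/2025cc/oprs', 'YOUR_TBA_KEY', cache) would then pay for a full download only when the data has actually changed. The get parameter exists so the logic can be exercised with a fake transport before the event.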

Make it reliable at event pace

One command, idempotent. Running the pipeline twice gives the same result; you can rebuild anytime without manual cleanup.
Fail loud on validation. If low_sample is set or scout_vs_opr_gap is large for a top team, surface it so a super-scout is dispatched, instead of silently ranking on thin data.
Cache API calls. Store TBA/Statbotics responses locally per refresh so a flaky venue connection or rate limit does not block your rebuild; QR scouting already works offline, so your analysis should degrade gracefully too.
Keep a manual fallback. If the script breaks mid-event, the agg spreadsheet alone (from the Worked Examples module) still produces a usable ranking. Never let the fancy pipeline be a single point of failure.
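
As one way to fail loud, the check below lists the top-ranked teams whose data should not be trusted yet. It assumes the agg table from the skeleton (columns ptsContributed, low_sample, scout_vs_opr_gap). The function name and the top_n and gap_threshold defaults are placeholders for illustration, not recommended values; tune them to your game and event.

```python
import pandas as pd


def teams_needing_super_scout(agg, top_n=8, gap_threshold=15.0):
    """Return top-ranked teams with thin data or a large scouting-vs-OPR gap."""
    top = agg.sort_values("ptsContributed", ascending=False).head(top_n)
    suspicious = (
        top["low_sample"]
        | (top["scout_vs_opr_gap"] > gap_threshold)
        | top["scout_vs_opr_gap"].isna()  # no OPR to compare against
    )
    return top.loc[suspicious, ["ptsContributed", "low_sample", "scout_vs_opr_gap"]]


demo_agg = pd.DataFrame(
    {
        "ptsContributed":   [60.0, 50.0, 10.0],
        "low_sample":       [False, True, True],
        "scout_vs_opr_gap": [2.0, 1.0, 30.0],
    },
    index=pd.Index([254, 1678, 118], name="teamNumber"),
)
flagged = teams_needing_super_scout(demo_agg, top_n=2)
if not flagged.empty:
    print("SUPER-SCOUT NEEDED:")
    print(flagged)
```

Restricting the check to the top of the ranking keeps the alert short enough to act on between match cycles; a low-sample flag on a team you would never pick is not worth a super-scout.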

What you have built

Put together with the earlier modules, this is a complete, defensible scouting operation: offline-first collection, measured scout accuracy, validated custom metrics, public-analytics cross-checks, predictive what-ifs, explicit defense evaluation, and a one-command rebuild that feeds picklists and pre-match briefs. That is the difference between a team that has data and a team that wins with it.

Key takeaways

Decouple the pipeline into ingest (append-only) -> enrich (TBA + Statbotics) -> compute (custom metrics + validation) -> output (rank + briefs).
Make rebuilds idempotent and one-command, cache API calls, and fail loud on low-sample or scouting-vs-OPR gaps to trigger super-scouts.
Always keep a manual spreadsheet fallback so the automated pipeline is never a single point of failure at an event.

Lesson quiz

1. An automated FRC analysis pipeline commonly pulls official match and event data from which source?

2. When authenticating to The Blue Alliance API v3, how should a pipeline supply its Read API key?

3. TBA recommends placing the auth key in the request header rather than the URL query string primarily because: