
Bad Data at the Source: Mislabeled, Missing, and Miscounted

The three data-entry failures that corrupt everything downstream, and the form and process fixes that stop them.

The garbage-in problem

No analysis survives bad inputs. Three errors account for most corrupted scouting datasets, and all three are preventable at the form and process level.

1. Wrong team (mislabeling)

A scout watches robot 1678 but records it under 1768. Now both teams' averages are wrong and you may never notice. Fixes:

Make team selection a pick-from-schedule field, not free text. In QRScout use the TBA-team-and-robot field type; in ScoutingPASS (PWNAGE Robotics, FRC 2451) pre-load the event schedule. The scout taps "Red 2" and the app knows the team number.
Assign each scout a fixed station (e.g., "always Red 2") for a whole match block, so attention is on one robot, not on remembering numbers.
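To make the pick-from-schedule idea concrete, here is a minimal sketch of the lookup that happens behind a "Red 2" tap. It does not reproduce how QRScout or ScoutingPASS implement it; the `SCHEDULE` structure, the `team_for` helper, and the match lineup are all invented for illustration.

```python
# Hypothetical pre-loaded schedule: match number -> station -> team number.
# The lineup below is made up for this example.
SCHEDULE = {
    1: {
        "Red 1": 254, "Red 2": 1678, "Red 3": 2451,
        "Blue 1": 1768, "Blue 2": 971, "Blue 3": 3476,
    },
}


def team_for(match: int, station: str) -> int:
    """Resolve the team number from the schedule so the scout never types it."""
    try:
        return SCHEDULE[match][station]
    except KeyError:
        # An unknown match or station is rejected instead of guessed.
        raise ValueError(
            f"no schedule entry for match {match}, station {station!r}"
        ) from None


print(team_for(1, "Red 2"))  # 1678
```

Because the team number is derived rather than typed, a 1678/1768 transposition cannot happen at entry time; the only remaining way to mislabel is to watch the wrong station.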

2. Missing matches

A scout misses a match (bathroom, dead tablet, confusion in the stands) and you get a hole. One missing match in a 3-match sample swings an average wildly: a team that scored 2, 8, and 8 averages 6, but lose the 2 and the average jumps to 8. Fixes:

Run a completeness check after every few matches: count rows per match; you should have exactly 6 (one per robot). A pivot table, or a COUNTIF over the match-number column, with highlighting on any match that does not have exactly 6 entries catches gaps in seconds.
Keep paper backups at every station so a scout with a dead device keeps recording.
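If the data lives outside a spreadsheet, the same completeness check can be sketched in a few lines of Python. The row layout (a dict with a `match` key) and the function name are assumptions for this example, not part of any scouting app.

```python
from collections import Counter

EXPECTED_ROWS = 6  # one row per robot in a match


def incomplete_matches(rows, matches_played):
    """Return {match: row_count} for each played match without exactly 6 rows."""
    counts = Counter(row["match"] for row in rows)
    return {
        m: counts.get(m, 0)
        for m in matches_played
        if counts.get(m, 0) != EXPECTED_ROWS
    }


stations = ("R1", "R2", "R3", "B1", "B2", "B3")
rows = [{"match": 1, "station": s} for s in stations]
# Match 2 is missing its Red 3 row; match 3 was played but never recorded.
rows += [{"match": 2, "station": s} for s in stations if s != "R3"]

print(incomplete_matches(rows, matches_played=[1, 2, 3]))  # {2: 5, 3: 0}
```

Checking for "not exactly 6" rather than "fewer than 6" also catches a duplicated submission, which would otherwise double-count a robot.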

3. Miscounting

During a fast REEFSCAPE cycle a scout records 4 L4 coral when the robot scored 5, or attributes a partner's coral to the wrong robot. Fixes:

Tap-as-it-happens counters (QRScout counter/multi-counter) beat tallying from memory after the buzzer.
Reduce cognitive load: one scout tracks coral, another tracks algae/endgame if you have the people, rather than one scout tracking everything.

A debugging workflow when numbers look wrong

When a team's stats look implausible, do not delete data; trace it:

Reproduce: pull that team's raw rows and find the outlier match.
Cross-check against TBA: compare the alliance total your scouts recorded for that match against the TBA score_breakdown. If the alliance sum is off by far more than match-to-match noise, a scout miscounted or mislabeled.
Isolate the scout: which station's row is the outlier? Look at that scout's other matches; a systematic bias (always low on L4) points to a training gap, not a one-off.
Fix the cause, not the cell: retrain or reassign the scout, add a validation rule, then re-run aggregation. Editing one number without finding the cause guarantees the error returns.
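The cross-check step can be sketched as below. The sketch assumes the official per-alliance coral counts have already been pulled out of the TBA score_breakdown into a plain dict; the field names inside score_breakdown vary by season and are not shown here. The function name, data layout, and default tolerance are all illustrative choices.

```python
def flag_alliance_mismatches(scouted, official, tolerance=2):
    """List (match, alliance, scouted_sum, official_count) for large gaps.

    scouted:  (match, alliance) -> list of per-robot counts from scouts
    official: (match, alliance) -> count taken from the official breakdown
    """
    flagged = []
    for key, official_count in sorted(official.items()):
        scouted_sum = sum(scouted.get(key, []))
        # Small gaps are treated as normal noise; only large ones are traced.
        if abs(scouted_sum - official_count) > tolerance:
            flagged.append((*key, scouted_sum, official_count))
    return flagged


# Hypothetical numbers for one match.
scouted = {(12, "red"): [5, 3, 4], (12, "blue"): [2, 6, 1]}
official = {(12, "red"): 12, (12, "blue"): 14}

print(flag_alliance_mismatches(scouted, official))  # [(12, 'blue', 9, 14)]
```

A flagged alliance only says that one of three rows is wrong. Deciding which one still takes the isolate-the-scout step: compare each station's row against that scout's other matches.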

Build validation into the sheet

Add a column that flags impossible values: more L4 coral than the 12 branches a reef level physically has, a negative count, or a scouted alliance total far above what that alliance actually scored in the match. Conditional formatting that turns these red means errors announce themselves instead of hiding in averages.
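The same rules can be written as a small row validator. This is a sketch under assumed column names (`l4_coral`, `algae`); only the 12-branch ceiling comes from the text above.

```python
COUNT_FIELDS = ("l4_coral", "algae")  # hypothetical column names
MAX_L4_CORAL = 12  # a reef level has 12 branches


def validate_row(row):
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    for field in COUNT_FIELDS:
        if row[field] < 0:
            problems.append(f"{field} is negative")
    if row["l4_coral"] > MAX_L4_CORAL:
        problems.append(f"l4_coral exceeds {MAX_L4_CORAL}")
    return problems


print(validate_row({"team": 1678, "l4_coral": 13, "algae": -1}))
# ['algae is negative', 'l4_coral exceeds 12']
```

Returning every problem found, rather than stopping at the first, mirrors what conditional formatting does: each bad cell lights up on its own.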

Key takeaways

The three killers are wrong-team labeling, missing matches, and miscounting; each has a specific form/process fix.
A per-match completeness check (exactly 6 rows) and a TBA score_breakdown cross-check surface holes and outliers fast.
Debug data by tracing the outlier to a scout/station and fixing the root cause, never by silently editing the cell.
