FRC Vision: Limelight vs PhotonVision and AprilTag Tracking

Imagine your robot knowing exactly where it sits on the field — to the centimeter — the instant a match starts, then driving itself to score without a human touching a joystick. That is not science fiction in modern FRC. It is what a camera, a handful of printed AprilTags, and a few lines of code can do. Vision is the single biggest force multiplier a software-focused team can add, and the two systems almost everyone uses are Limelight and PhotonVision. This guide explains what each one is, how they track AprilTags, and how that data becomes a robot pose you can actually drive with.

Why vision matters in FRC

Wheel encoders and a gyro give you odometry — a running guess of where the robot is based on how far the wheels have turned. The problem is drift: every wheel slip, every defender shove, every carpet seam adds error that compounds over a 2.5-minute match. By the time you reach the scoring zone, your "position" can be off by a foot or more.

Vision fixes this by giving the robot an absolute reference. There are two big jobs vision does:

  • Pose estimation — figuring out the robot's exact field position by looking at AprilTags whose locations are published in advance.
  • Game-piece detection — finding and aiming at the season's scoring object (a neural-network task, since game pieces have no convenient tag on them).

Get pose estimation working and you unlock auto-alignment, repeatable autonomous routines, and on-the-fly path correction.

What are AprilTags?

AprilTags are a visual fiducial system developed by researchers at the University of Michigan for low-overhead, high-accuracy localization. They look like simplified QR codes. Since 2024, FIRST has used the 36h11 family — a 6x6 grid of bits surrounded by a black-and-white border. FIRST places these tags at known positions around the field, so when your camera spots one, it can work backward to compute where the robot must be.

A single AprilTag detection gives you three things: the tag ID, a 3D pose of the camera relative to the tag (this requires a calibrated camera), and the precise pixel locations of the tag's corners. WPILib bundles the official field positions in an AprilTagFieldLayout, loaded from a JSON file. For the 2026 Rebuilt season there are even two layouts — the welded field and the AndyMark field — because the two field constructions differ slightly.
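
To make "work backward" concrete, here is a minimal sketch of the transform chain in plain Java, flattened to 2D for readability. Pose2, TagLocalizer, and robotOnField are hypothetical names invented for this illustration; real robot code would use WPILib's 3D geometry classes instead, and angles here are not wrapped to a fixed range.

```java
/** A 2D rigid transform: translation in meters, rotation in radians (hypothetical helper). */
record Pose2(double x, double y, double theta) {

    /** Composes this transform with another one expressed in this transform's frame. */
    Pose2 then(Pose2 next) {
        double c = Math.cos(theta);
        double s = Math.sin(theta);
        return new Pose2(
            x + c * next.x - s * next.y,
            y + s * next.x + c * next.y,
            theta + next.theta);
    }

    /** Inverse transform: rotate back, then undo the translation. */
    Pose2 inverse() {
        double c = Math.cos(theta);
        double s = Math.sin(theta);
        return new Pose2(-(c * x + s * y), -(-s * x + c * y), -theta);
    }
}

final class TagLocalizer {
    /**
     * fieldToTag:    where the tag sits on the field (from the published layout).
     * cameraToTag:   the tag as seen from the camera (from the detection).
     * robotToCamera: where the camera is mounted on the robot (measured by you).
     */
    static Pose2 robotOnField(Pose2 fieldToTag, Pose2 cameraToTag, Pose2 robotToCamera) {
        // Walk from the field origin to the tag, back along the camera's view to the camera,
        // then back along the mount offset to the robot center.
        Pose2 fieldToCamera = fieldToTag.then(cameraToTag.inverse());
        return fieldToCamera.then(robotToCamera.inverse());
    }
}
```

The same chain shows why all three inputs matter: an error in the tag layout, the detection, or the measured camera mount passes straight through to the robot pose.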

One catch worth knowing early: pose ambiguity. With a single tag, multiple real-world camera positions can project to nearly the same corner locations in the image, so the math sometimes can't tell which is correct. The fixes are seeing multiple tags at once, fusing with odometry history, or telling the camera your robot's heading. Both Limelight and PhotonVision have features built around solving exactly this.
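
One of those fixes, using the robot's heading, can be sketched in a few lines of plain Java. This assumes the single-tag solve hands back two candidate poses and that you only need to compare their headings against the gyro; AmbiguityResolver and its method names are hypothetical, not part of either vendor's API.

```java
final class AmbiguityResolver {

    /** Smallest signed difference a - b between two angles, in radians, in (-pi, pi]. */
    static double angleDiff(double a, double b) {
        double d = (a - b) % (2 * Math.PI);
        if (d > Math.PI) {
            d -= 2 * Math.PI;
        } else if (d <= -Math.PI) {
            d += 2 * Math.PI;
        }
        return d;
    }

    /** Returns the index of the candidate heading that agrees best with the gyro heading. */
    static int pickByHeading(double[] candidateHeadingsRad, double gyroHeadingRad) {
        int best = 0;
        for (int i = 1; i < candidateHeadingsRad.length; i++) {
            double current = Math.abs(angleDiff(candidateHeadingsRad[i], gyroHeadingRad));
            double leader = Math.abs(angleDiff(candidateHeadingsRad[best], gyroHeadingRad));
            if (current < leader) {
                best = i;
            }
        }
        return best;
    }
}
```

The wrap-around handling matters: headings of 178 degrees and -175 degrees are only 7 degrees apart, and a naive subtraction would reject the right candidate.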

Limelight: the hardware smart camera

Limelight is a smart camera — a self-contained unit with a lens, image sensor, and an onboard processor that does all the vision math inside the camera. You configure zero-code pipelines for color blobs, AprilTags, and neural networks through a built-in web interface, then read results over NetworkTables. Nothing extra to assemble.

The current lineup includes the Limelight 2+, 3, 3A, 3G, and 4. The differences matter:

Model | Image sensor | Frame rate @ resolution | Built-in IMU
Limelight 3 | OV5647 color, rolling shutter | 90 fps @ 640x480 | No
Limelight 3G | OV9281 mono, global shutter | 120 fps @ 1280x800 | No
Limelight 4 | OV9281 mono, global shutter | 120 fps @ 1280x800 | Yes

The jump to a global shutter sensor (3G and 4) is significant: global shutters capture the whole frame at once, so AprilTags stay crisp even when the robot is moving fast, while rolling shutters smear them. The Limelight 4 also adds a built-in IMU (for the MegaTag2 algorithm described below) and supports a Hailo-8 accelerator for YOLOv8 object detection at up to 80 fps. Note the LL4 accepts a 5V-24V input (35V absolute maximum) but dropped Power-over-Ethernet support, so plan your wiring on the electrical side accordingly.

MegaTag and MegaTag2

Limelight's localization secret sauce is MegaTag. The original MegaTag (MT1) combines the corners of all visible tags into one pose, which beats averaging individual single-tag estimates and reduces ambiguity. MegaTag2 (MT2) goes further: if you feed it your robot's heading every frame via LimelightHelpers.SetRobotOrientation(), it produces a stable, ambiguity-free pose even from a single tag at long range. You read the result from the NetworkTables key botpose_orb_wpiblue (blue-origin coordinates) or through the helper getBotPoseEstimate_wpiBlue_MegaTag2(). The tradeoff: MT2 trusts the heading you give it, so your gyro must be accurate.
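
If you read the NetworkTables array yourself rather than through the helper, the values arrive as a flat list of doubles. The sketch below shows one way to unpack it in plain Java. The index layout (x, y, z, roll, pitch, yaw, total latency in milliseconds, tag count, then further statistics) is an assumption based on the published botpose format, so check it against the documentation for your firmware. BotPoseSample is a hypothetical type, not part of LimelightHelpers.

```java
/** A subset of one botpose array; the field indices are assumed, not guaranteed. */
record BotPoseSample(double xMeters, double yMeters, double yawDegrees,
                     double latencyMs, int tagCount) {

    /** Parses the raw array, or returns empty if it is too short or no tags were seen. */
    static java.util.Optional<BotPoseSample> parse(double[] raw) {
        if (raw == null || raw.length < 8) {
            return java.util.Optional.empty();
        }
        int tagCount = (int) raw[7];
        if (tagCount < 1) {
            return java.util.Optional.empty();
        }
        // Assumed layout: [0]=x, [1]=y, [5]=yaw, [6]=total latency (ms), [7]=tag count.
        return java.util.Optional.of(
            new BotPoseSample(raw[0], raw[1], raw[5], raw[6], tagCount));
    }

    /** Capture time on the robot clock: "now" minus the reported total latency. */
    double captureTimestampSeconds(double nowSeconds) {
        return nowSeconds - latencyMs / 1000.0;
    }
}
```

Rejecting samples with zero tags up front keeps a stale or empty array from ever reaching the pose estimator.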

PhotonVision: free software on a coprocessor

PhotonVision is free, open-source vision software you flash onto a small Linux computer — a coprocessor — that you supply yourself. Common choices are an Orange Pi 5, a Raspberry Pi, or a Rubik Pi 3, paired with a USB or CSI camera such as an Arducam OV9281. You get the same web-based tuning interface and AprilTag pipelines, but you pick the hardware, which means you control the cost and can run multiple cameras.

PhotonVision's answer to ambiguity is MultiTag localization: it solves a single Perspective-n-Point (PnP) problem across every visible tag's corners on the coprocessor, producing one robust field-relative pose. This is the recommended approach for all teams because it is the most accurate.

For game-piece detection, PhotonVision runs neural-network object detection, but only on coprocessors with a dedicated NPU (neural accelerator) — currently the Orange Pi 5 (using RKNN model format) and the Rubik Pi 3 (using TensorFlow Lite). The software ships with a season-specific model; for 2026 that detects the FUEL game piece. Frames are letterboxed to 640x640 before inference, and each detection returns a bounding box, a class, and a unitless confidence score from 0 to 1.
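
Letterboxing is worth understanding because bounding boxes come back in the model's 640x640 space, not in camera pixels. The following plain-Java sketch shows the generic technique: scale the frame to fit, pad the shorter side, and invert that mapping afterwards. It assumes the padding is split evenly on both sides, which may not match PhotonVision's internals exactly; Letterbox and its members are hypothetical names.

```java
final class Letterbox {

    /** How a source frame was fitted into a square model input. */
    record Fit(double scale, int scaledWidth, int scaledHeight, int padX, int padY) {}

    /** Scales the frame to fit inside targetSize x targetSize and centers it with padding. */
    static Fit fit(int sourceWidth, int sourceHeight, int targetSize) {
        double scale = Math.min(
            (double) targetSize / sourceWidth,
            (double) targetSize / sourceHeight);
        int scaledWidth = (int) Math.round(sourceWidth * scale);
        int scaledHeight = (int) Math.round(sourceHeight * scale);
        return new Fit(scale, scaledWidth, scaledHeight,
            (targetSize - scaledWidth) / 2,
            (targetSize - scaledHeight) / 2);
    }

    /** Maps an x coordinate in model space back to source-image pixels. */
    static double toSourceX(Fit fit, double modelX) {
        return (modelX - fit.padX()) / fit.scale();
    }

    /** Maps a y coordinate in model space back to source-image pixels. */
    static double toSourceY(Fit fit, double modelY) {
        return (modelY - fit.padY()) / fit.scale();
    }
}
```

For a 1280x800 frame this gives a scale of 0.5, a 640x400 image, and 120 rows of padding above and below, so the center of the model input maps back to the center of the camera frame.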

PhotonPoseEstimator in code

On the robot side you use PhotonPoseEstimator, one per camera. You construct it with the AprilTagFieldLayout and a Transform3d describing exactly where the camera sits relative to the robot center. The recommended strategy is Coprocessor MultiTag (via estimateCoprocMultiTagPose), which combines all visible tag corners; you can fall back to estimateLowestAmbiguityPose when only one tag is visible. Each estimate method takes a PhotonPipelineResult from the camera and returns an Optional<EstimatedRobotPose> (empty when no pose can be produced, for example when no tags are visible) containing an estimatedPose (Pose3d) and a timestampSeconds for latency compensation.

How vision feeds pose estimation

Both systems ultimately hand WPILib a pose with a timestamp, and WPILib fuses it with your odometry using a pose estimator. There are three, matched to your drivetrain: SwerveDrivePoseEstimator, DifferentialDrivePoseEstimator, and MecanumDrivePoseEstimator. They are drop-in replacements for the plain odometry classes and run a Kalman filter under the hood.

The pattern is two calls:

  • update() — called every loop with your gyro and encoder data to track position continuously.
  • addVisionMeasurement(pose, timestamp, stdDevs) — called whenever a vision pose arrives to snap odometry back toward the truth. It applies latency compensation automatically using the timestamp.
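
The vision half of that pattern looks like this in PhotonLib, assuming camera is your PhotonCamera and photonEstimator is its PhotonPoseEstimator:

for (var result : camera.getAllUnreadResults()) {
    // Prefer the multi-tag solve; fall back to the least ambiguous single tag.
    var estimate = photonEstimator.estimateCoprocMultiTagPose(result);
    if (estimate.isEmpty()) {
        estimate = photonEstimator.estimateLowestAmbiguityPose(result);
    }
    // Fuse the pose using its capture timestamp for latency compensation.
    estimate.ifPresent(est -> swervePoseEstimator.addVisionMeasurement(
        est.estimatedPose.toPose2d(), est.timestampSeconds));
}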
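
To see why the timestamp matters, here is a one-dimensional sketch of the idea behind latency compensation, in plain Java. It is an illustration only, not WPILib's implementation: it keeps a short history of odometry samples, works out where odometry believed the robot was when the frame was captured, and applies the resulting error to the current position. LatencyCompensation, its methods, and the gain parameter are hypothetical.

```java
final class LatencyCompensation {

    /** Linearly interpolates a position at time t between two timestamped samples. */
    static double interpolate(double t0, double x0, double t1, double x1, double t) {
        double alpha = (t - t0) / (t1 - t0);
        return x0 + alpha * (x1 - x0);
    }

    /**
     * Corrects the newest odometry position using a vision fix captured in the past.
     * times must be increasing and hold at least two samples; gain in [0, 1] is how
     * much of the error to apply. Capture times outside the history are extrapolated.
     */
    static double correct(double[] times, double[] odometryX,
                          double captureTime, double visionX, double gain) {
        // Find the pair of samples that brackets the capture time.
        int i = 1;
        while (i < times.length - 1 && times[i] < captureTime) {
            i++;
        }
        double odometryAtCapture = interpolate(
            times[i - 1], odometryX[i - 1], times[i], odometryX[i], captureTime);
        // Compare vision against the past estimate, not the current one.
        double error = visionX - odometryAtCapture;
        return odometryX[odometryX.length - 1] + gain * error;
    }
}
```

With a robot moving at 2 m/s and a frame that is 30 ms old, comparing vision against the current position instead of the past one would misjudge the error by 6 cm on every update.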

The stdDevs (standard deviations) are how much you trust a measurement: smaller numbers mean "believe this more." They are given for x and y in meters and for heading in radians. A close, multi-tag measurement deserves low standard deviations; a far, single-tag one deserves high values so the filter leans on odometry instead. WPILib's defaults are 0.9 for vision x, y, and heading, versus 0.1 for the model states, so out of the box odometry is trusted more than vision. Tuning these well is what separates a robot that snaps cleanly to its target from one that jitters.
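
A common starting point is to compute the standard deviations per measurement from the tag count and distance. The plain-Java sketch below shows one such heuristic. Every constant in it is a hypothetical placeholder to be tuned on your own robot, and VisionTrust is an invented name.

```java
final class VisionTrust {

    /**
     * Returns {x, y, heading} standard deviations (meters, meters, radians)
     * for one vision measurement. All constants are illustrative, not recommended values.
     */
    static double[] stdDevs(int tagCount, double averageTagDistanceMeters) {
        if (tagCount < 1) {
            throw new IllegalArgumentException("need at least one visible tag");
        }
        // Multi-tag solves start out more trusted than single-tag solves.
        double baseXy = tagCount >= 2 ? 0.3 : 0.9;
        // Trust falls off with the square of the distance to the tags.
        double distanceFactor =
            1.0 + averageTagDistanceMeters * averageTagDistanceMeters / 30.0;
        double xy = baseXy * distanceFactor;
        // With one tag, make the heading value huge so the gyro dominates.
        double heading = tagCount >= 2 ? 0.5 : 1.0e6;
        return new double[] {xy, xy, heading};
    }
}
```

In robot code the three values would be packed into the vector passed as the third argument of addVisionMeasurement(), for example with WPILib's VecBuilder.fill().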

Latency, mounting, and calibration

Three practical things make or break vision:

  • Latency — vision data is always a little old. Always pass the real capture timestamp to addVisionMeasurement() so WPILib rewinds odometry to when the picture was actually taken.
  • Mounting — the Transform3d from robot center to camera must match reality precisely; a few degrees of tilt error becomes large position error at distance. Mount the camera rigidly. Coordinate this with your mechanical build.
  • Calibration — every camera+resolution combo must be calibrated to remove lens distortion before 3D poses are trustworthy. Both tools have guided calibration in their web UIs.
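
The mounting point is easy to quantify. As a rough sketch in plain Java, treating the mount error as a pure yaw offset and ignoring everything else, the sideways position error grows with the distance to the tag times the tangent of the angle error. MountingError is a hypothetical helper for this back-of-the-envelope check.

```java
final class MountingError {

    /** Approximate sideways position error caused by a camera yaw error at a given range. */
    static double lateralErrorMeters(double distanceMeters, double angleErrorDegrees) {
        return distanceMeters * Math.tan(Math.toRadians(angleErrorDegrees));
    }

    public static void main(String[] args) {
        // A 2 degree mount error viewed from 4 m away is roughly 0.14 m of position error.
        System.out.printf("%.3f m%n", lateralErrorMeters(4.0, 2.0));
    }
}
```

That is more than many scoring tolerances allow, which is why measuring the camera transform carefully pays off more than most software tuning.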

Limelight vs PhotonVision: the comparison

Factor | Limelight | PhotonVision
What it is | Hardware smart camera | Free software on your own coprocessor
Cost | Higher upfront (camera price) | Software free; pay only for Pi + camera
Ease of setup | Easiest: plug in and go | More setup (flash, wire, calibrate)
Flexibility | Fixed hardware | Choose camera, run many cameras
Pose algorithm | MegaTag / MegaTag2 | MultiTag PnP on coprocessor
Object detection | Built-in (Hailo on LL4) | NPU coprocessor only (Orange Pi 5 / Rubik Pi 3)
Best for | Teams wanting reliability fast | Teams wanting control and lower cost

There is no wrong answer. Limelight gets a rookie team aiming at AprilTags in an afternoon. PhotonVision rewards teams willing to learn Linux with a cheaper, more flexible multi-camera setup. Both produce the same end product: a timestamped pose your SwerveDrivePoseEstimator can fuse.

Ready to wire vision into your robot code? Start with the LearnFRC Programming track.
