Tracking Pitches for Broadcast Television
COMPUTING PRACTICES

A cable network’s desire to enhance its baseball telecasts resulted in K Zone, a computerized video tracking system that may have broader applications.

André Guéziec
Triangle Software

Computer 0018-9162/02/$17.00 © 2002 IEEE

In baseball, a pitcher’s fame and fortune depend on his mastery of the strike zone. Pitches that pass outside the strike zone count as balls and can be safely ignored. Those that pass through it untouched, however, count as strikes, three of which will retire the batter to his team’s dugout. Players, fans, and sports journalists thus have an intense interest in compiling statistics about these pitches, as do the umpires who determine each pitch’s status when it crosses the plate.

During the 2001 major league baseball season, the strike zone received special attention when officials decided to enforce the game’s original strike zone definition, placing the zone’s upper limit between the batter’s shoulders and belt. In the past, major-league umpires had rarely called a strike above the belt. League officials and journalists thought that the effect of enforcing the original definition would be so significant it might change the hierarchy of hitters and pitchers. Further, 2001 turned out to be a particularly exciting year for baseball as Barry Bonds pursued, and ultimately surpassed, Mark McGwire’s single-season home run record set in 1998.

These developments made tracking pitches accurately more important than ever. Tracking the flight of a pitch during a live broadcast presents two major challenges, however: speed and image-processing reliability. Speed is an issue because ensuring rapid calculation of the trajectory practically requires real-time processing of the 60-fields-per-second video.

Ensuring image-processing reliability, on the other hand, requires overcoming several obstacles. During a baseball game, dramatic changes in lighting conditions and the movement of objects and players can result in a shifting pattern of light and color that makes it especially difficult to track a pitched ball. Further, several ballparks have a net in place behind home plate, which adds to the visual clutter that the image-processing system must filter out when tracking the baseball.

Meeting these challenges required developing a complex system that fuses high-end computer graphics with a sophisticated algorithm for calculating flight trajectories. The ESPN K Zone system uses computer-generated graphics to create a shaded, translucent box that outlines the strike zone boundaries for viewers. Behind the flashy graphics, K Zone (named after a synonym for the strike zone) is a sophisticated computing system that monitors each pitch’s trajectory [1].

K ZONE TAKES SHAPE

In February 2001, ESPN contracted with Sportvision to build a system for analyzing baseball pitches during its Major League Baseball broadcasts. ESPN wanted a system that would determine electronically, within one to two centimeters, whether each pitch qualified as a strike or a ball. The system would then draw a representation of the strike zone on the TV screen, superimposed over the replayed broadcast video, to clearly show the pitch’s status. ESPN chose Sportvision for the project because of the company’s track record in graphically enhancing sports broadcasts.

Figure 1. K Zone during a televised game. The pitch-tracking effect is an integral and unobtrusive part of the telecast.

System overview

Figure 1 shows K Zone in action during a television broadcast. ESPN insisted that the effect appear on the program video as an integral part of the scene, not as a separate graphical animation.
To fulfill this requirement, the developers minimized the graphics so that they would not obscure any part of the game.

The overall broadcast enhancement system uses three subsystems to produce the final televised graphics:

• The camera pan-tilt-zoom encoding subsystem calibrates the broadcast cameras in real time.
• The measurement subsystem detects the baseball’s trajectory, measures the batter’s stance, and determines if the pitch is a strike or a ball.
• The graphic overlay subsystem uses these measurements to produce the televised graphics. To draw them in the proper position, this subsystem needs the real-time calibration data that the camera subsystem provides.

The trajectory component, which consists of three PCs connected to three video cameras, tracks a pitched baseball’s flight toward the strike zone. Two cameras observe the baseball, while the third observes the batter to provide proper sizing for the strike zone.

To calibrate the broadcast cameras, technicians install an encoder on each camera that measures the pan and tilt angles, zoom voltage, and zoom extender positions. The encoders collect these measurements 30 times per second and transmit them to the graphic overlay subsystem.

The graphic overlay subsystem renders a graphic and superimposes it on the broadcast video. This graphic consists of two video streams: the fill, which contains the actual graphic, and the key, which contains the transparency map that indicates the video pixels the graphic affects. These two streams are input to the linear keyer, a piece of video equipment that overlays the graphic on the broadcast video. The graphic overlay subsystem uses an SGI O2 computer to draw a three-dimensional representation of the strike zone in the position that the broadcast camera’s pan, tilt, and zoom parameters specify.

Although Sportvision had used the camera and graphic-overlay systems in their broadcasts for several years, using them with K Zone required modifications. The measurement subsystem had to be built from scratch.

Measurement subsystem

K Zone’s measurement system uses two Pentium 4 PCs running Windows 2000, linked to two video cameras that observe each pitch. An operator uses a third camera and PC to locate the strike zone’s top and bottom boundaries. All these components fit conveniently in one short equipment rack.

During operation, each PC processes the video for one camera in real time. The processing uses a four-way multithreaded software architecture. One thread reads the video frame into memory, a second displays the video, a third handles the image processing, and the fourth writes the video to disk.

We tested the pitch-tracking system extensively before ESPN used it in its broadcasts. Technicians checked various camera locations for tracking the baseball, selected views that permitted the most reliable detection, and refined the tracking algorithm. These tests took place over several weeks early in the season at baseball games played in Oakland, Minneapolis, and New York.

TRACKING CONSTRAINTS

To accomplish pitch tracking, the developers needed to deal with four primary constraints.

Performance

Full-resolution digital NTSC (National Television System Committee) or PAL (phase alternating line) video requires 270 Mbits per second of bandwidth. Importing, displaying, and exporting this data in real time takes several passes through the personal computer’s PCI bus, stretching it nearly to capacity. Doing all this data transmission in real time requires carefully optimized software engineering. At the very least, multithreading is essential to keep the CPU working on the video-processing pipeline while waiting for the next video field or frame to arrive. The system then decomposes the video-processing pipeline into tasks executed independently in a thread-safe manner.

Real-time operation

Although ESPN planned to use the system for replays, we designed the image-processing pipeline to work in real time to keep pace with the video frame rate, with a delay of two seconds. Such a design allows using the effect live, provided the program receives a two-second or longer delay when broadcast. Sports broadcasts commonly apply such delays.

Even when used for replays alone, the pitch-tracking system still needed to process video quickly. Creating a successful replay required executing several steps in rapid succession. In addition to processing the video, using the system required coordinating with ESPN’s television operators to cue the appropriate footage. So pressing were these constraints that the show’s director would periodically cancel replays for lack of time.

Reliability

Image processing and computer vision have been well-established academic fields since the 1970s.

Computer, March 2002

Figure 2. Broken trajectory mapping. This ultimately unsuccessful algorithm seeks the baseball pattern in the color image, resulting in the scattered ball positions that the red squares denote. The algorithm became ineffective when lighting conditions, the field, and team colors combined to make parts of the uniforms and background look more like a baseball than the baseball itself.

... meant that the video would contain 60 fields, or half frames, per second. The interlacing, which is the television standard, displays two fields in alternating lines on a frame and thus represents two different moments in time.

Each camera has a field of view that covers about half the baseball’s flight. As a consequence of the relatively wide field of view, the baseball’s image consists of only a few pixels. Depending on the view, background, foreground, and other factors, the baseball can appear as no more than two pixels after detection. In one view, the ball passes near, and often over, the white foul line, creating a white-on-white image.

Several moving objects and shadows could be mistaken for the ball as well. The home plate umpire, catcher, and batter typically stay immobile, then move swiftly and precisely when the ball is pitched, while the computers are busy detecting it. Baseball uniforms typically have white or gray patches, such as a white handkerchief hanging from the umpire’s pocket. Helmets exhibit specular highlights that can be mistaken for the ball in some