Write an Executive Summary based on an article of your choosing from a pre-selected group given by the instructor. The target audience is your peers. The level of technical detail provided in the paper should be commensurate with the general knowledge exp

Asked: Feb 9th, 2019

Question Description

Topic: Technical Journal Article Executive Summary

Goal and Purpose of the Paper

Write an Executive Summary based on an article of your choosing from a pre-selected group given by the instructor. The target audience is your peers. The level of technical detail provided in the paper should be commensurate with the general knowledge expected of others in your discipline. The paper should be designed around the specific technical journal article selected. Please use an expository rhetorical mode that informs and describes the article clearly. You are NOT required to include additional outside references. Take special note of the important results/conclusions in the article.

Paper Specifications

The scope (breadth) and level of detail (depth) of the paper should be carefully managed so that the amount of information and the level of detail are balanced for the given length of the paper, the target audience, and the type of information being presented. The executive summary should be written to the following general specifications:

PAGES – 2 pages maximum

FORM – (organization and hierarchy of the information presented in your paper)

The paper should contain an introductory paragraph followed by the main body of text, which should be appropriately subdivided using descriptive first and second-level headings as needed to indicate your chosen hierarchy of information. The paper should end with a brief conclusion that ties the paper together and brings closure to your selected topic.

FORMAT – (visual cues and aesthetic appeal)

The paper should be set up with the following formatting parameters:

  • Body text must be double-spaced and left-justified (not full-justified)
  • Font size should be 12-point (may vary slightly with the particular chosen font)
  • Margins should be no larger than one inch (left, right, top, and bottom)
  • Headings should be appropriately sized to indicate the first and second-level status
  • Body text pages must be numbered (page one begins on the first page of body text)

The finished research paper must include the following sections:

  • Cover page – descriptive title, the author’s name and ID, course name, etc.
  • List of Figures, if any. Not needed for this paper; can be included if deemed necessary.
  • List of Tables, if any. Not needed for this paper; can be included if deemed necessary.
  • Body text – introduction, body, conclusion, including first and second-level descriptive headings (if necessary), and numbered pages as required
  • Works Cited page: Not needed for this paper.

STYLE – The paper should reflect a formal, scientific tone, avoiding unnecessary use of first-person pronouns (I, me, my, we, ours, us) when possible, and be written to an audience of your peers.

Submitting the Completed Paper

Upload an electronic copy of the final draft in PDF format (*.pdf) by the due date.

Assignment Grading Criteria

The paper will be graded based on the UCI Upper Division Writing Rubric discussed in class.

Unformatted Attachment Preview

J Real-Time Image Proc (2007) 2:117–132
DOI 10.1007/s11554-007-0041-1

SPECIAL ISSUE

Real-time camera tracking using sports pitch markings

Graham Thomas

Received: 1 June 2007 / Accepted: 23 August 2007 / Published online: 10 October 2007
© Springer-Verlag 2007

G. Thomas
BBC Research, Kingswood Warren, Tadworth, Surrey, UK
e-mail: graham.thomas@rd.bbc.co.uk
URL: http://www.bbc.co.uk/rd

Abstract When broadcasting sports events such as football, it is useful to be able to place virtual annotations on the pitch, to indicate things such as distances between players and the goal, or whether a player is offside. This requires the camera position, orientation, and focal length to be estimated in real time, so that the graphics can be rendered to match the camera view. Whilst this can be achieved by using sensors on the camera mount and lens, they can be impractical or expensive to install, and often the broadcaster only has access to the video feed itself. This paper presents a method for computing the position, orientation and focal length of a camera in real time, using image analysis. The method uses markings on the pitch, such as arcs and lines, to compute the camera pose. A novel feature of the method is the use of multiple images to improve the accuracy of the camera position estimate. A means of automatically initialising the tracking process is also presented, which makes use of a modified form of Hough transform. The paper shows how a carefully chosen set of algorithms can provide fast, robust and accurate tracking for this real-world application.

Keywords Camera tracking · Camera calibration · Pose estimation · Football · Soccer

1 Introduction

One key aim of a broadcaster covering a sports event is to help “tell a story” to the viewers, by giving insight into what is happening, and analysing the action. To help with sports analysis, a common requirement is to be able to overlay graphics on the image, which appear to be tied to the ground.
It is also useful to be able to show markings at the correct absolute scale, such as distances from a player to the goal. This requires knowledge of the camera pose (position and orientation), as well as the focal length (or zoom), i.e. a full metric camera calibration. The calibration data generally needs to be generated at full video rate (50 or 60 Hz). Examples of some typical overlaid graphics are shown in Fig. 1.

The camera pan, tilt and zoom need to be estimated to a sufficient accuracy so that overlaid graphics appear stable and accurate to around 1 pixel or better in the image. However, there is generally no need to make a real-time estimate of the camera position, roll, or lens distortion:

  • Cameras at sports events are typically mounted on fixed pan/tilt heads (Fig. 2). Although the principal point of the lens may move by 20–30 cm as the camera pans (as the axis about which it pans is generally a little way behind the lens), this movement is usually negligible compared with the scale of the graphics being added (which may have line thicknesses equivalent to around 0.5 m as shown in Fig. 1).
  • The camera mount prevents the camera from rolling (i.e. it cannot rotate about its direction-of-view).
  • Although there will usually be a small amount of lens distortion, particularly at wide angles, most commercial real-time rendering software currently does not correct for it, so there is generally no need to estimate it for applications involving real-time graphics insertion.

Whilst real-time graphics overlay is the primary application considered in this paper, there are other applications where more accurate camera calibration is needed, for example multi-camera 3D reconstruction [5]. For these, an estimation of lens distortion and principal point position are likely to be needed.

Fig. 1 Graphics overlaid using camera pose data

Fig. 2 A TV camera at the Twickenham Rugby Stadium, UK

One way in which camera calibration data can be derived is by performing an initial off-line calibration of the position of the camera mounting using a theodolite or range-finder, and mounting sensors on the camera and the lens to measure the pan, tilt, and zoom. However, this is costly and sometimes very difficult, for example if it is not possible to gain access to the camera mount, or if the camera is mounted on a non-rigid structure such as a crane.

A more attractive way of deriving calibration data is by analysis of the camera image. Camera calibration is typically carried out using a calibration object having easily identifiable features in accurately known locations, for example a planar checkerboard pattern [14]. The lines on a sports pitch are usually in known positions, and these can be used to compute the camera pose. In sports such as football, the layout of some pitch markings (such as those around the goal) is fully specified, but the overall dimensions vary between grounds. For example, the English Football Association specifies that the pitch length must be in the range 90–120 m, and the width 45–90 m; for international matches, less variation is allowed (length 100–110 m and width 64–75 m) [4]. It is thus necessary to obtain a measurement of the actual pitch.

One example of past work in this area is [16], in which an exhaustive search over position, orientation and field-of-view is performed to match a wire-frame model of the pitch lines to the lines detected in the camera image. Although this method requires no manual initialisation, the exhaustive search approach suggested is likely to result in processing times that are too long to be practical for applications in sports graphics generation, where the system needs to initialise in about 1 s, and track at 50–60 frames per second.
While the quantisation inherent in the searching process in [16] may be sufficient for the application described (gathering statistics on player position), it is unlikely to result in a camera pose that has sufficient accuracy for convincing graphical overlay.

An example of a method that computes the camera pose by iterative minimisation of the error between observed lines and reprojected lines from a model is [15]. This method works by locating edge points in the image close to the predicted edge position, but no method of automatic initialisation is presented. It is stated that the method tracks with a small amount of jitter, which can be reduced by using texture information for tracking in addition to the edge information.

Other researchers have investigated the detection of specific kinds of line features that occur in football. For example, [17] presents a method for detecting the ellipse in the image formed by the centre circle, although this only works when more than half of the circle is visible, and makes no use of other line features in the image. Billboards around the pitch can also be tracked [1], although many football grounds now have animated or scrolling billboards, which would make such an approach unsuitable.

In this paper, we propose a method based on line tracking, similar to [15] in that the camera pose is computed to minimise the reprojection error to observed edge points. We also use a variant of the multi-hypothesis approach that [15] describes: only those edge points closest to the predicted line position are considered, rather than using all points within a given search area. This provides robustness to the appearance of other nearby edge points. However, our method includes an automatic initialisation process, similar in concept to the exhaustive search of [16], but implemented in such a way that the process can be carried out in about 1 s.
We also take advantage of the fact that TV cameras at outside broadcasts are often mounted on fixed pan/tilt heads, so that their position remains roughly constant over time. This paper extends the results previously reported in [13]. An overview of the whole process is given in Fig. 3.

Fig. 3 Overview of the method

Initial estimation of positions of cameras (Section 2): carried out once per football match (or re-use from last match at same ground)
  • Uses multiple images

Initialisation (Section 3): carried out each time tracking is started
  • Locate pixels on pitch lines
  • Compute spatialised Hough transform
  • Compute match value for all poses by summing Hough bins
  • Choose pose giving highest match value (after refinement)

Tracking (Section 4): carried out once per image (50 Hz or 60 Hz)
  • Project lines of pitch model into image using previous pose
  • Identify pixels that match predicted line width and orientation that lie close to projected lines
  • Fit lines through identified pixels
  • Compute pose that minimises reprojection error between fitted lines and projected lines

The following section describes the method used to estimate the position of each camera mounting, when the system is first set up. Section 3 explains the initialisation method, which is invoked when starting to track, or when the system has lost track (for example, if the camera view moved away from showing any pitch lines). Section 4 discusses the real-time tracking process. Section 5 presents some results, and the remaining sections present discussions and conclusions.

2 Estimation of the camera position

Most cameras covering events such as football generally remain in fixed positions during a match. Indeed, the positions often remain almost unchanged between different matches at the same ground, as the camera mounting points are often rigidly fixed to the stadium structure. A typical camera mounting was shown in Fig. 2.
It therefore makes sense to use this prior knowledge to compute an accurate camera position, which is then used as a constraint during the subsequent tracking process.

Estimating camera position from image correspondences can be a poorly constrained problem, particularly if focal length also needs to be estimated. This is because the effect on the image of a small change in focal length is similar to a translation of the camera along its direction of view. If the variation of depth in the image is small compared to the distance from the camera to the scene (for example, in the case of a planar scene whose normal is nearly parallel to the direction of view), the two effects are essentially indistinguishable. This is why some camera calibration methods such as [18] propose the use of multiple images of an inclined planar calibration object in order to improve their accuracy: they constrain the internal parameters of the camera (focal length and lens distortion) to be the same in all images, but allow the external parameters (position and orientation) to vary. A single optimisation process estimates the common internal parameters and the separate external parameters for all images.

In our application, we need to compute the orientation and focal length of images in real time as they arrive, so it is not immediately apparent that an approach using multiple images is suitable. However, as the main cause of uncertainty comes when trying to estimate both the focal length and position simultaneously from a single image, we can get the benefits of a multiple-image approach by pre-computing the camera position using multiple images grabbed with a wide range of pan and tilt angles, and constraining the position to this estimated value when computing pan, tilt and focal length using an individual image during real-time tracking.

The pose computation method described in Sect. 4 is used to compute the camera position, orientation and field-of-view, for a number of different camera orientations, covering a wide range of pan angles. The pose for all images is computed in a single optimisation process, constraining the position to a common value for all the images, but estimating separate values of pan, tilt and focal length for every image. This significantly reduces the inherent ambiguity between the distance of the camera from the reference features and the focal length, even though the focal length is allowed to vary in each image. By including views of features in a wide range of positions (e.g. views of both goal areas), the uncertainty lies along different directions, and solving for a common position allows this uncertainty to be significantly reduced. Examples of the position estimation process are given in Sect. 5.1.

Mounted cameras usually cannot roll (i.e. rotate about their direction of view). However, this does not necessarily mean that the roll angle can be assumed to be zero. The camera may not necessarily be mounted flat on the pan/tilt head, or the head itself may not be aligned with the pan axis exactly vertical. Also, it is usually assumed that the plane of the pitch is horizontal, but this may not be the case. Each of these effects can give rise to what appears to be a small amount of camera roll, which could vary with pan or tilt, depending on the cause. One solution would be to compute camera roll for every image during the tracking process, but this introduces an additional degree of freedom, and therefore will increase the noise sensitivity and increase the minimum number of features needed for accurate pose computation. Another option would be to attempt to solve for each of these small mis-alignments separately.
Instead, we chose to solve for the apparent rotation of the pitch plane about the dominant direction of the camera view (for example, allowing the pitch to rotate about the z axis for a camera whose direction-of-view when looking at the centre of the pitch is closely parallel to the z axis). We found that in practice this accounts sufficiently well for the combination of effects described earlier. The pitch rotation is computed during the global position computation process, giving a single value optimised for all the images used.

Although it is generally best to apply this technique by manually selecting around 10–20 images from each camera (for example, by picking images from a video tape recorded from a previous game at the ground, that show a large number of lines and cover a wide range of pan and tilt values), we have developed a method whereby images can be acquired automatically to refine the position computation during the tracking process. The camera position and orientation are first set manually to give good alignment between actual and projected pitch lines for a given image. The tracking process (Sect. 4) then computes new values of pan, tilt and roll for every image, using the previous pose as an initial estimate. Each time a pose is computed that has values of pan or tilt that are significantly different from those of images used previously (typically by about 10°), the image is added to the list of images used for global position computation, and a new globally consistent position and pitch plane rotation are calculated, incorporating the newly captured image. This allows the system to automatically refine its estimated position during tracking. The camera position and pitch rotation computed in this way are then used for the initialisation and tracking processes described below.
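The automatic image-acquisition rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `is_new_view` and `select_views` are hypothetical, the threshold is assumed to be 10 degrees (the text gives "about 10"), "significantly different" is interpreted as differing in pan or tilt from every previously used image, and the global position re-estimation is only marked by a comment.

```python
from typing import List, Tuple

# A tracked pose reduced to the two angles the rule looks at (degrees).
Pose = Tuple[float, float]  # (pan_deg, tilt_deg)


def is_new_view(pose: Pose, used: List[Pose], threshold_deg: float = 10.0) -> bool:
    """Return True if pan or tilt differs by at least threshold_deg
    from every pose already used for position estimation (assumed rule)."""
    pan, tilt = pose
    return all(
        abs(pan - p) >= threshold_deg or abs(tilt - t) >= threshold_deg
        for p, t in used
    )


def select_views(tracked: List[Pose], threshold_deg: float = 10.0) -> List[Pose]:
    """Walk through tracked poses in order and keep the sufficiently new ones."""
    used: List[Pose] = []
    for pose in tracked:
        if is_new_view(pose, used, threshold_deg):
            used.append(pose)
            # Hypothetical hook: this is where a new globally consistent
            # camera position and pitch rotation would be recomputed.
    return used


# Example: pan sweeps right, then the camera tilts down.
tracked_poses = [(0.0, -5.0), (3.0, -5.0), (11.0, -5.0), (12.0, -6.0), (11.0, -16.0)]
selected = select_views(tracked_poses)
print(selected)
```

Only views that add a clearly different pan or tilt are kept, so the set used for the joint position estimate stays small while still covering a wide range of orientations.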
3 Initialisation

3.1 Approach

The Hough transform [7] is a well-known way of finding lines in an image. It maps a line in the image to a point (or accumulator “bin”) in Hough space, where the two axes represent the angle of the line and the shortest distance to the centre of the image. If the camera pose is known roughly, it is possible to predict which peak in Hough space corresponds to which known line in the world, and hence to calibrate the camera. However, if the camera pose is unknown, the correspondence can be difficult to establish, as there may be many possible permutations of correspondences. Furthermore, if some lines are curved rather than straight, they will not give rise to a well-defined peak and are thus hard to identify.

Rather than attempting to establish the correspondence between world lines and peaks in Hough space, we use the Hough transform as a means to allow us to quickly establish a measure of how well the image matches the set of lines that would be expected to be visible from a given pose. A “match value” for a set of lines can be obtained by adding together the set of bins in Hough space that correspond to the lines we are looking for. Thus, to test for the presence of a set of N lines, we only have to add together N values from the Hough transform, rather than examining all the pixels in the image that we would expect the lines to lie upon. We use this in an exhaustive search process, to establish the match value for each pose that we consider. This provides a much faster way of measuring the degree of match than that used in the exhaustive search process proposed in [16].

For each pre-determined camera position, we search over the full range of plausible values of pan, tilt, and field-of-view, calculating the match value by summing the values in the bins in the Hough transform that correspond to the line positions that would be expected. By representing a curved line as a series of line segments, curves can also contribute to the match, even if they do not give rise to local maxima in the Hough transform. We used one segment for every 20° of arc. Although specific forms of Hough transform exist for circle or ellipse detection (such as that used in [17]), we chose the line segment approach to allow both curves and lines to be handled in a single process.

The following two sub-sections explain the kind of Hough transform used, and give further implementation details.

3.2 A variant on the Hough transform that maintains spatial information

A point in a conventional Hough transform represents a line of infinite length, i.e. the information about which part of the line a contributing point lies on is lost. This means that, when measuring the degree of match for a line segment from the value in the corresponding Hough bin, all edge pixels that are co-linear with this segment will be considered, even if they lie beyond the ends of the line segment. This makes the match value less reliable, as noise samples, or samples from other lines, will contribute when they should not. In particular, the peak in the Hough transform from a short line segment (or a short segment of a curve) may be no higher than a peak caused by samples from several other line segments and from the limbs of a player, that coincidentally happen to be co-linear. That is, the number of pixels giving a significant output from the line detector that lie along a short genuine line segment may be no larger than the number that lie on a longer ‘imaginary’ line that passes through players and parts of other lines. The presence of a short line segment could thus be incorrectly inferred in this situation.
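The match-value idea from Section 3.1, and the weakness of a conventional Hough transform just described, can be illustrated with a small Python sketch. It is not the paper's implementation: the function names, the 1-degree angle step, the 1-pixel distance step and the synthetic pixel sets are all assumptions made for illustration. Pixel coordinates are taken relative to the image centre, matching the distance axis described in the text.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Bin = Tuple[int, int]  # (angle in degrees, quantised distance from image centre)


def hough_bin(x: float, y: float, theta_deg: int, rho_step: float = 1.0) -> Bin:
    """Bin hit by pixel (x, y) for a line whose normal is at angle theta_deg."""
    theta = math.radians(theta_deg)
    rho = x * math.cos(theta) + y * math.sin(theta)
    return (theta_deg, round(rho / rho_step))


def hough_transform(edge_pixels: Iterable[Tuple[float, float]]) -> Dict[Bin, int]:
    """Conventional Hough transform: every pixel votes once per angle."""
    acc: Dict[Bin, int] = defaultdict(int)
    for x, y in edge_pixels:
        for theta_deg in range(0, 180):  # assumed 1-degree angle step
            acc[hough_bin(x, y, theta_deg)] += 1
    return acc


def match_value(acc: Dict[Bin, int], expected_lines: List[Bin]) -> int:
    """Sum the N bins of the N lines expected for a candidate pose."""
    return sum(acc.get(b, 0) for b in expected_lines)


# A short genuine segment: 10 adjacent pixels on the horizontal line y = 10.
segment = [(x, 10) for x in range(0, 10)]
# 10 unrelated pixels that merely happen to be co-linear on y = -30,
# spread across the full image width.
scattered = [(x, -30) for x in range(-225, 275, 50)]

acc = hough_transform(segment + scattered)

# Horizontal lines have their normal at 90 degrees, so rho equals y.
print(acc[(90, 10)], acc[(90, -30)])              # both bins hold the same count
print(match_value(acc, [(90, 10), (90, -30)]))   # N = 2 lookups, no pixel scan
```

Both bins receive the same number of votes, so a conventional accumulator cannot tell the compact segment from the widely scattered pixels. Testing a pose costs only one bin lookup per expected line, which is what makes the exhaustive search affordable.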
The information that the genuine peak came from samples in a specific localised area, whilst the other came from a spatially diverse area, is lost in a conventional Hough transform. Various methods of incorporating spatial information in a Hough transform have been proposed before; for example, [6] describes an approach in which the input image is divided into a quad-tree, so that the lines in a particular region can be identified. We chose a relatively simple approach to retaining spatial information, which we refer to here as a spatialised Hough transform. Rather than sub-dividing the image into 2D regions, we divide each line into S 1D segments: this maintains a common set of bins for the whole image, with each bin being sub-divided into S sections. Rather than divide every line into S equal sections, for simplicity we divide the line by reference to either the horizontal portion of the image in which it lies (for lines that are closer to horizontal than vertical), or the vertic ...

Tutor Answer

School: Cornell University



Technical Journal Article Executive Summary
Student’s Name
Institutional Affiliation




This paper presents an executive summary of the technical journal article “Real-time camera
tracking using sports pitch markings,” authored by Graham Thomas. It covers the purpose of the
article, its focus, its most effective arguments and pieces of evidence, and the primary
conclusions drawn from it. The article describes the value of placing virtual annotations on
the pitch to indicate the distances between players and the goal, or whether a player is
offside. Such annotations are overlaid on the pitch when broadcasting live sports events such
as soccer. The purpose of the article is to provide an
approach used to compute the real-time camera location, how it should be o...
