## Description

### Unformatted Attachment Preview


Module Code
MAT00046M
MMath and MSc Examinations 2020/21
Department:
Mathematics
Title of Exam:
General Relativity
Time Allowed:
You have 24 hours from the release of this exam to upload your solutions.
Allocation of Marks:
The marking scheme shown on each question is indicative only.
Question:  1   2   3   4   5   Total
Marks:     20  10  10  20  40  100
Instructions for Candidates:
Answer all questions. It is important to show your work and reasoning in order to demonstrate
your knowledge and understanding.
Queries:
If you believe that there is an error on this exam paper, then please use the “Queries” link
below the exam on Moodle.
This will be available for the first hour after the release of this exam.
After that, if a question is unclear, then answer it as best you can and note the assumptions
you’ve made to allow you to proceed.
Submission:
Please write clearly and submit a single copy of your solution to each question. Any
handwritten work in your electronic submission must be legible. Black ink is recommended
for written answers. View your submission before uploading.
Number each page of your solutions consecutively. Write the exam title, your candidate
number, and the page number at the top of each page.
Upload your solutions to the “Exam submission” link below the exam on Moodle (preferably
as a single PDF file). If you are unable to do this, then email them to
maths-submit@york.ac.uk.
A Note on Academic Integrity
We are treating this online examination as a time-limited open assessment, and you are
therefore permitted to refer to written and online materials to aid you in your answers.
However, you must ensure that the work you submit is entirely your own, and for 28
hours after the exam is released, you must not:
• communicate with departmental staff on the topic of the assessment (except by means
of the query procedure detailed overleaf),
• communicate with other students on the topic of this assessment,
• seek assistance on this assessment from the academic and/or disability support
services, such as the Writing and Language Skills Centre, Maths Skills Centre and/or
Disability Services (unless you have been recommended an exam support worker in a
Student Support Plan),
• seek advice or contribution from any third party, including proofreaders, friends, or
family members.
We expect, and trust, that all our students will seek to maintain the integrity of the
assessment, and of their awards, through ensuring that these instructions are strictly
followed. Failure to adhere to these requirements will be considered a breach of the
Academic Misconduct regulations, where the offences of plagiarism, breach/cheating,
collusion and commissioning are relevant — see Section AM.1.2.1 of the Guide to
Assessment (note that this supersedes Section 7.3).
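1 (of 5).
Let $g_{\mu\nu}$ be a Lorentzian metric (on a four-dimensional manifold) and $u^\mu$ a unit timelike vector field. Set
$$h_{\mu\nu} = g_{\mu\nu} - u_\mu u_\nu$$
and, for a generic (0,2) tensor $A_{\mu\nu}$, define
$$B_{\mu\nu} = \left(\tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta + h_\nu{}^\lambda h_\mu{}^\eta\right) - \tfrac{1}{3}\, h_{\mu\nu} h^{\lambda\eta}\right) A_{\lambda\eta}.$$
(a) Show that $h_{\mu\nu}$ satisfies the following identities:
$$h_{\mu\nu} = h_{\nu\mu}, \qquad h_{\mu\nu} u^\nu = 0, \qquad h_\mu{}^\nu h_{\nu\lambda} = h_{\mu\lambda}, \qquad h_\mu{}^\mu = 3.$$
[6]
(b) Using the identities from (a), show that $B_{\mu\nu}$ satisfies the following identities:
$$B_{\mu\nu} = B_{\nu\mu}, \qquad B_{\mu\nu} u^\nu = 0, \qquad B_\mu{}^\mu = 0.$$
[6]
(c) Show that
$$A_{\mu\nu} = a\, u_\mu u_\nu + b\, h_{\mu\nu} + c_\nu u_\mu + d_\mu u_\nu + \tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta - h_\nu{}^\lambda h_\mu{}^\eta\right) A_{\lambda\eta} + B_{\mu\nu},$$
where
$$a = A_{\mu\nu} u^\mu u^\nu, \qquad b = \tfrac{1}{3} A_{\mu\nu} h^{\mu\nu}, \qquad c_\mu = u^\nu h_\mu{}^\lambda A_{\nu\lambda}, \qquad d_\mu = u^\lambda h_\mu{}^\nu A_{\nu\lambda}.$$
[8]
2 (of 5).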
Consider the two-dimensional surface in Euclidean space $\mathbb{R}^3$ parametrized by angles $\theta, \phi \in [0, 2\pi)$ such that
$$x = (R + r\cos\phi)\cos\theta, \qquad y = (R + r\cos\phi)\sin\theta, \qquad z = r\sin\phi$$
with respect to Cartesian coordinates $x, y, z$, and where $R > r > 0$ are constants.
Determine the (induced) metric of the surface in the $(\theta, \phi)$ coordinates.
[10]
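3 (of 5).
Given a Lorentzian metric $g_{\mu\nu}$ and a scalar field $\Omega > 0$, define a new metric by
$$\tilde g_{\mu\nu} = \Omega^2 g_{\mu\nu}.$$
The Christoffel symbols of the new metric are given by
$$\tilde\Gamma^\mu{}_{\nu\lambda} = \Gamma^\mu{}_{\nu\lambda} + \Omega^{-1}\left(\delta^\mu_\nu\, \Omega_{,\lambda} + \delta^\mu_\lambda\, \Omega_{,\nu} - g_{\nu\lambda}\, g^{\mu\eta}\, \Omega_{,\eta}\right).$$
(a) Show that a null vector with respect to $g_{\mu\nu}$ is also null with respect to $\tilde g_{\mu\nu}$. [2]
(b) Show that a null geodesic with respect to $g_{\mu\nu}$ is also a null geodesic with respect to $\tilde g_{\mu\nu}$. Hint: Reparametrize the geodesic using $\Omega^{-2}$.
[8]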
4 (of 5).
The energy tensor of a minimally coupled real scalar field $\phi$ with mass $m \ge 0$ is
$$T_{\mu\nu} = \phi_{|\mu}\phi_{|\nu} - \tfrac{1}{2}\, g_{\mu\nu}\, \phi_{|\lambda}\phi^{|\lambda} + \tfrac{1}{2}\, m^2\phi^2 g_{\mu\nu}.$$
(a) Show that, provided that $\phi$ is not constant in any region of spacetime, energy conservation implies the Klein-Gordon equation
$$0 = \Box\phi + m^2\phi,$$
where $\Box\phi := \phi_{|\mu}{}^{\mu}$.
[10]
(b) Show that the energy density observed by an observer with 4-velocity $v^\mu$ is non-negative.
[10]
Hint: First explain how Sylvester's theorem can be used to reduce the problem (pointwise) to Minkowski spacetime, and why it is then enough to consider $v^\mu = (1, 0, 0, 0)$.
5 (of 5).
Consider on $\{(x, y) \in \mathbb{R}^2 \mid y > 0\}$, for some $a > 0$, the metric
$$ds^2 = \frac{a^2}{y^2}\left(dx^2 + dy^2\right).$$
(a) Show that the geodesic equations are
$$0 = \ddot x - \frac{2}{y}\,\dot x\dot y, \qquad 0 = \ddot y + \frac{1}{y}\left(\dot x^2 - \dot y^2\right).$$
[8]
(b) Show that the only non-vanishing components of the Riemann tensor are given by
$$R^x{}_{yxy} = -R^x{}_{yyx} = R^y{}_{xyx} = -R^y{}_{xxy} = -\frac{1}{y^2}.$$
[6]
(c) Calculate the Ricci tensor and the scalar curvature.
[2]
(d) Show that
$$\zeta^i = (x, y), \qquad \chi^i = (y^2 - x^2, -2xy) \qquad (\ast)$$
are Killing vector fields.
[6]
(e) Find a Killing vector which is linearly independent from the vectors $(\ast)$. [4]
(f) Show that, if $\dot x \neq 0$, geodesics are given by semicircles
$$(x - x_0)^2 + y^2 = r^2, \qquad x_0 \in \mathbb{R},\ r > 0.$$
[8]
Hint: One way to show this is to use one of the Killing vectors. Another way is to use the equations of (a) and the chain rule to derive
$$2y\frac{d^2y}{dx^2} + 2\left(\frac{dy}{dx}\right)^2 + 2 = \frac{d^2(y^2)}{dx^2} + 2 = 0$$
and then to solve it.
(g) Show that the arc length of the geodesic $x^2 + y^2 = r^2$ between the points
$$p = \left(-c, \sqrt{r^2 - c^2}\right), \qquad q = \left(c, \sqrt{r^2 - c^2}\right), \qquad 0 < c < r,$$
is given by the integral
$$\int_{-c}^{c} \frac{ar}{r^2 - x^2}\, dx$$
and then evaluate the integral. Hint: To evaluate the integral, you can use integration by partial fractions.
[6]
End of examination.
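SOLUTIONS: MAT00046M
1.
(a) Using the symmetry $g_{\mu\nu} = g_{\nu\mu}$, we have
$$h_{\mu\nu} = g_{\mu\nu} - u_\mu u_\nu = g_{\nu\mu} - u_\nu u_\mu = h_{\nu\mu}.$$
1 Mark
Using the fact that $u^\mu$ is unit timelike ($u_\mu u^\mu = 1$), we have
$$h_{\mu\nu} u^\nu = g_{\mu\nu} u^\nu - u_\mu u_\nu u^\nu = u_\mu - u_\mu = 0.$$
2 Marks
Since this implies $u^\nu h_{\nu\lambda} = h_{\lambda\nu} u^\nu = 0$, we find
$$h_\mu{}^\nu h_{\nu\lambda} = g_\mu{}^\nu h_{\nu\lambda} - u_\mu u^\nu h_{\nu\lambda} = g_\mu{}^\nu h_{\nu\lambda} = h_{\mu\lambda}.$$
2 Marks
Finally, using $g_\mu{}^\mu = 4$, as the spacetime is 4-dimensional, we get
$$h_\mu{}^\mu = g_\mu{}^\mu - u_\mu u^\mu = 4 - 1 = 3.$$
1 Mark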
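As a sanity check rather than a proof, the identities in (a) and the properties of $B_{\mu\nu}$ claimed in the question can be tested numerically. The sketch below assumes flat Minkowski space with signature $(+,-,-,-)$, a boosted four-velocity, and a random tensor $A_{\mu\nu}$; the function names are illustrative only and are not part of the exam.

```python
import math
import random

# Diagonal of the Minkowski metric, signature (+, -, -, -), so u^mu u_mu = +1.
# For this metric the inverse has the same diagonal entries.
ETA = [1.0, -1.0, -1.0, -1.0]
IDX = range(4)

def unit_timelike(vx, vy, vz):
    """Four-velocity u^mu = gamma * (1, v) for a 3-velocity with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return [gamma, gamma * vx, gamma * vy, gamma * vz]

def projector(u_up):
    """h_{mu nu} = eta_{mu nu} - u_mu u_nu, both indices down."""
    u_dn = [ETA[m] * u_up[m] for m in IDX]
    return [[(ETA[m] if m == n else 0.0) - u_dn[m] * u_dn[n] for n in IDX]
            for m in IDX]

def residuals(u_up, A):
    """Largest absolute violation of each identity; all should be ~ 0."""
    h = projector(u_up)
    hm = [[h[m][l] * ETA[l] for l in IDX] for m in IDX]      # h_mu^lambda
    # Scalar h^{lambda eta} A_{lambda eta}.
    tr = sum(ETA[l] * ETA[e] * h[l][e] * A[l][e] for l in IDX for e in IDX)
    B = [[sum(0.5 * (hm[m][l] * hm[n][e] + hm[n][l] * hm[m][e]) * A[l][e]
              for l in IDX for e in IDX) - h[m][n] * tr / 3.0
          for n in IDX] for m in IDX]
    return {
        "h.u": max(abs(sum(h[m][n] * u_up[n] for n in IDX)) for m in IDX),
        "h.h - h": max(abs(sum(hm[m][n] * h[n][l] for n in IDX) - h[m][l])
                       for m in IDX for l in IDX),
        "tr h - 3": abs(sum(hm[m][m] for m in IDX) - 3.0),
        "B - B^T": max(abs(B[m][n] - B[n][m]) for m in IDX for n in IDX),
        "B.u": max(abs(sum(B[m][n] * u_up[n] for n in IDX)) for m in IDX),
        "tr B": abs(sum(ETA[m] * B[m][m] for m in IDX)),
    }

if __name__ == "__main__":
    random.seed(0)
    A = [[random.uniform(-1.0, 1.0) for _ in IDX] for _ in IDX]
    for name, value in residuals(unit_timelike(0.3, -0.2, 0.5), A).items():
        print(f"{name:10s} {value:.2e}")
```

Every residual should be at the level of floating-point rounding. A curved metric would need the full inverse metric in place of the diagonal `ETA`.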
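(b) We have
$$B_{\mu\nu} = \left(\tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta + h_\nu{}^\lambda h_\mu{}^\eta\right) - \tfrac{1}{3}\, h_{\mu\nu} h^{\lambda\eta}\right) A_{\lambda\eta} = B_{\nu\mu},$$
where in the first term we used the explicit symmetry in the indices $\mu$ and $\nu$, while in the second term we applied the first identity from (a).
2 Marks
Then we use the second identity from (a) in all terms to find
$$B_{\mu\nu} u^\nu = \left(\tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta u^\nu + h_\mu{}^\eta h_\nu{}^\lambda u^\nu\right) - \tfrac{1}{3}\, h_{\mu\nu} u^\nu h^{\lambda\eta}\right) A_{\lambda\eta} = 0.$$
2 Marks
Finally, using the third identity from (a) to simplify the first term and the fourth identity to simplify the second term, we get
$$B_\mu{}^\mu = \left(\tfrac{1}{2}\left(h^{\mu\lambda} h_\mu{}^\eta + h_\mu{}^\lambda h^{\mu\eta}\right) - \tfrac{1}{3}\, h_\mu{}^\mu h^{\lambda\eta}\right) A_{\lambda\eta} = \left(h^{\lambda\eta} - h^{\lambda\eta}\right) A_{\lambda\eta} = 0.$$
2 Marks
(c) Using the definition of $h_{\mu\nu}$ and expanding,
$$A_{\mu\nu} = g_\mu{}^\lambda g_\nu{}^\eta A_{\lambda\eta} = \left(h_\mu{}^\lambda + u_\mu u^\lambda\right)\left(h_\nu{}^\eta + u_\nu u^\eta\right) A_{\lambda\eta}$$
$$= A_{\lambda\eta} u^\lambda u^\eta\, u_\mu u_\nu + u^\lambda h_\nu{}^\eta A_{\lambda\eta}\, u_\mu + u^\eta h_\mu{}^\lambda A_{\lambda\eta}\, u_\nu + h_\mu{}^\lambda h_\nu{}^\eta A_{\lambda\eta},$$
where we identify $a$ in the first term, $c_\nu$ in the second, and $d_\mu$ in the third.
4 Marks
To finish the calculation, we still need to show that
$$h_\mu{}^\lambda h_\nu{}^\eta A_{\lambda\eta} = b\, h_{\mu\nu} + \tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta - h_\nu{}^\lambda h_\mu{}^\eta\right) A_{\lambda\eta} + B_{\mu\nu}.$$
Indeed, the right-hand side equals
$$\text{RHS} = \tfrac{1}{3}\, h_{\mu\nu} h^{\lambda\eta} A_{\lambda\eta} + \tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta - h_\nu{}^\lambda h_\mu{}^\eta\right) A_{\lambda\eta} + \left(\tfrac{1}{2}\left(h_\mu{}^\lambda h_\nu{}^\eta + h_\nu{}^\lambda h_\mu{}^\eta\right) - \tfrac{1}{3}\, h_{\mu\nu} h^{\lambda\eta}\right) A_{\lambda\eta} = h_\mu{}^\lambda h_\nu{}^\eta A_{\lambda\eta}$$
thanks to two obvious cancellations.
4 Marks
Total: 20 Marks
2.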
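We have
$$dx = -(R + r\cos\phi)\sin\theta\, d\theta - r\sin\phi\cos\theta\, d\phi,$$
$$dy = (R + r\cos\phi)\cos\theta\, d\theta - r\sin\phi\sin\theta\, d\phi,$$
$$dz = r\cos\phi\, d\phi.$$
5 Marks
Hence (using $\cos^2\alpha + \sin^2\alpha = 1$)
$$ds^2 = dx^2 + dy^2 + dz^2$$
$$= (R + r\cos\phi)^2(\sin^2\theta + \cos^2\theta)\, d\theta^2 + r^2\sin^2\phi\,(\cos^2\theta + \sin^2\theta)\, d\phi^2$$
$$\quad + 2(R + r\cos\phi)(r\sin\theta\sin\phi\cos\theta - r\cos\theta\sin\phi\sin\theta)\, d\theta\, d\phi + r^2\cos^2\phi\, d\phi^2$$
$$= (R + r\cos\phi)^2\, d\theta^2 + r^2\, d\phi^2.$$
The cross term vanishes because the contributions from $dx^2$ and $dy^2$ cancel.
5 Marks
Total: 10 Marks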
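The result can be cross-checked numerically by differentiating the embedding with central differences and taking Euclidean dot products of the tangent vectors. This is only an illustrative sketch: the radii `R = 2.0`, `r = 0.5`, the step `eps` and the function names are assumptions chosen for the example.

```python
import math

def embed(theta, phi, R=2.0, r=0.5):
    """Cartesian coordinates of the torus point with angles (theta, phi)."""
    rho = R + r * math.cos(phi)
    return (rho * math.cos(theta), rho * math.sin(theta), r * math.sin(phi))

def induced_metric(theta, phi, R=2.0, r=0.5, eps=1e-6):
    """Return (g_theta_theta, g_theta_phi, g_phi_phi) by central differences."""
    def diff(plus, minus):
        return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Tangent vectors d(x,y,z)/d(theta) and d(x,y,z)/d(phi).
    e_theta = diff(embed(theta + eps, phi, R, r), embed(theta - eps, phi, R, r))
    e_phi = diff(embed(theta, phi + eps, R, r), embed(theta, phi - eps, R, r))
    return dot(e_theta, e_theta), dot(e_theta, e_phi), dot(e_phi, e_phi)

if __name__ == "__main__":
    theta, phi = 0.7, 1.1
    g_tt, g_tp, g_pp = induced_metric(theta, phi)
    print(g_tt, (2.0 + 0.5 * math.cos(phi)) ** 2)  # should agree closely
    print(g_tp)                                    # should be close to 0
    print(g_pp, 0.5 ** 2)                          # should agree closely
```

The off-diagonal component comes out at rounding level, which matches the cancellation of the $d\theta\, d\phi$ terms in the calculation above.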
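3.
(a) If $n^\mu$ is a null vector with respect to $g_{\mu\nu}$, then $g_{\mu\nu} n^\mu n^\nu = 0$ and thus also
$$0 = \Omega^2 g_{\mu\nu} n^\mu n^\nu = \tilde g_{\mu\nu} n^\mu n^\nu.$$
That is, $n^\mu$ is also null with respect to $\tilde g_{\mu\nu}$.
2 Marks
(b) Consider a null geodesic with respect to $g_{\mu\nu}$ and affine parameter $\lambda$. We have
$$\tilde\Gamma^\mu{}_{\nu\eta} \frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda} = \Gamma^\mu{}_{\nu\eta} \frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda} + \Omega^{-1}\left(2\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}\frac{\partial\Omega}{\partial x^\nu} - g_{\nu\eta}\frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda}\,\Omega^{,\mu}\right)$$
$$= \Gamma^\mu{}_{\nu\eta} \frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda} + 2\,\Omega^{-1}\frac{dx^\mu}{d\lambda}\frac{d\Omega}{d\lambda},$$
where we used the formula given in the question and the fact that the geodesic is null.
3 Marks
Let $\tau$ be another parameter along the curve such that $d\lambda/d\tau = \Omega^{-2}$. Then
$$\frac{dx^\mu}{d\tau} = \frac{d\lambda}{d\tau}\frac{dx^\mu}{d\lambda} = \Omega^{-2}\frac{dx^\mu}{d\lambda},$$
$$\frac{d^2x^\mu}{d\tau^2} = \frac{d\lambda}{d\tau}\frac{d}{d\lambda}\left(\frac{d\lambda}{d\tau}\frac{dx^\mu}{d\lambda}\right) = \Omega^{-4}\left(\frac{d^2x^\mu}{d\lambda^2} - 2\,\Omega^{-1}\frac{d\Omega}{d\lambda}\frac{dx^\mu}{d\lambda}\right).$$
2 Marks
Hence, combining the results above,
$$\frac{d^2x^\mu}{d\tau^2} + \tilde\Gamma^\mu{}_{\nu\eta}\frac{dx^\nu}{d\tau}\frac{dx^\eta}{d\tau} = \Omega^{-4}\left(\frac{d^2x^\mu}{d\lambda^2} - 2\,\Omega^{-1}\frac{dx^\mu}{d\lambda}\frac{d\Omega}{d\lambda} + \tilde\Gamma^\mu{}_{\nu\eta}\frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda}\right)$$
$$= \Omega^{-4}\left(\frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu{}_{\nu\eta}\frac{dx^\nu}{d\lambda}\frac{dx^\eta}{d\lambda}\right) = 0$$
and we see that the curve is also a null geodesic with respect to $\tilde g_{\mu\nu}$ and affine parameter $\tau$.
3 Marks
Total: 10 Marks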
4.
(a) Conservation of the energy tensor implies
$$0 = T^{\mu\nu}{}_{|\nu} = \phi^{|\mu}{}_{\nu}\,\phi^{|\nu} + \phi^{|\mu}\phi^{|\nu}{}_{\nu} - g^{\mu\nu}\phi_{|\lambda\nu}\,\phi^{|\lambda} + g^{\mu\nu}\phi_{|\nu}\, m^2\phi$$
$$= \phi^{|\mu}\left(\phi^{|\nu}{}_{\nu} + m^2\phi\right) + \phi^{|\mu}{}_{\nu}\,\phi^{|\nu} - \phi^{|\mu}{}_{\lambda}\,\phi^{|\lambda}$$
$$= \phi^{|\mu}\left(\phi^{|\nu}{}_{\nu} + m^2\phi\right),$$
where we use that two covariant derivatives of a scalar commute to simplify the third term of the first line. The right-hand side vanishes for non-constant $\phi$ only if $\phi_{|\mu}{}^{\mu} + m^2\phi = 0$.
10 Marks
(b) By Sylvester's theorem, given any point $p$ there is a coordinate system such that $g_{\mu\nu}(p) = \eta_{\mu\nu}$. Using Lorentz transformations, we can choose it such that
$$v^\mu = (1, 0, 0, 0).$$
4 Marks
Hence, at any point, we can choose coordinates such that the energy density as seen by the observer is
$$T_{\mu\nu} v^\mu v^\nu = (\phi_{|t})^2 - \tfrac{1}{2}\left((\phi_{|t})^2 - \sum_i (\phi_{|i})^2\right) + \tfrac{1}{2}\, m^2\phi^2 = \tfrac{1}{2}\left((\phi_{|t})^2 + \sum_i (\phi_{|i})^2 + m^2\phi^2\right) \ge 0$$
because a sum of squares is non-negative.
6 Marks
Total: 20 Marks
5.
(a) To verify the statement, we compute the Christoffel symbols:
$$\Gamma^y{}_{xx} = -\tfrac{1}{2}\, g^{yy} g_{xx,y} = \frac{1}{y},$$
$$\Gamma^x{}_{yx} = \Gamma^x{}_{xy} = \tfrac{1}{2}\, g^{xx} g_{xx,y} = -\frac{1}{y},$$
$$\Gamma^y{}_{yy} = \tfrac{1}{2}\, g^{yy} g_{yy,y} = -\frac{1}{y},$$
whereas all other Christoffel symbols vanish.
4 Marks
Thus the geodesic equations are
$$0 = \ddot x + 2\,\Gamma^x{}_{xy}\,\dot x\dot y = \ddot x - \frac{2}{y}\,\dot x\dot y,$$
$$0 = \ddot y + \Gamma^y{}_{xx}\,\dot x^2 + \Gamma^y{}_{yy}\,\dot y^2 = \ddot y + \frac{1}{y}\left(\dot x^2 - \dot y^2\right).$$
4 Marks
(b) In two dimensions, the Riemann tensor has only one independent component. The only non-zero components of $R_{ijkl}$ are $R_{xyxy} = R_{[xy][xy]}$. Raising the first index using the diagonal metric, we see that the only non-zero components are $R^x{}_{y[xy]} = R^y{}_{x[yx]}$.
3 Marks
It is thus sufficient to compute
$$R^x{}_{yxy} = \Gamma^x{}_{yy,x} - \Gamma^x{}_{yx,y} + \Gamma^x{}_{xi}\,\Gamma^i{}_{yy} - \Gamma^x{}_{yi}\,\Gamma^i{}_{yx}$$
$$= 0 - \frac{1}{y^2} + \Gamma^x{}_{xy}\,\Gamma^y{}_{yy} - \left(\Gamma^x{}_{yx}\right)^2 = (-1 + 1 - 1)\frac{1}{y^2} = -\frac{1}{y^2}.$$
3 Marks
(c) A straightforward computation yields
$$R_{xx} = R_{yy} = -\frac{1}{y^2}, \qquad R = g^{xx} R_{xx} + g^{yy} R_{yy} = -\frac{2}{a^2}.$$
1 Mark
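(d) Killing's equation is $\xi_{i|j} + \xi_{j|i} = 0$. In components this yields the three equations
$$0 = \xi_{x|x} = \xi_{x,x} - \Gamma^y{}_{xx}\,\xi_y = \xi_{x,x} - \frac{1}{y}\,\xi_y,$$
$$0 = \xi_{y|y} = \xi_{y,y} - \Gamma^y{}_{yy}\,\xi_y = \xi_{y,y} + \frac{1}{y}\,\xi_y,$$
$$0 = \xi_{x|y} + \xi_{y|x} = \xi_{x,y} + \xi_{y,x} - 2\,\Gamma^x{}_{xy}\,\xi_x = \xi_{x,y} + \xi_{y,x} + \frac{2}{y}\,\xi_x.$$
1 Mark
Lowering the index of $\zeta^i = (x, y)$, we obtain $\zeta_i = a^2\left(\frac{x}{y^2}, \frac{1}{y}\right)$. Plugging this into the equations above, we find
$$\zeta_{x,x} - \frac{1}{y}\,\zeta_y = a^2\left(\frac{1}{y^2} - \frac{1}{y^2}\right) = 0,$$
$$\zeta_{y,y} + \frac{1}{y}\,\zeta_y = a^2\left(-\frac{1}{y^2} + \frac{1}{y^2}\right) = 0,$$
$$\zeta_{x,y} + \zeta_{y,x} + \frac{2}{y}\,\zeta_x = a^2\left(-\frac{2x}{y^3} + \frac{2x}{y^3}\right) = 0.$$
Hence $\zeta^i$ is a Killing vector.
3 Marks
Lowering the index of $\chi^i = (y^2 - x^2, -2xy)$, we obtain $\chi_i = a^2\left(1 - \frac{x^2}{y^2}, -\frac{2x}{y}\right)$. Plugging this into the component form of Killing's equation, we find
$$\chi_{x,x} - \frac{1}{y}\,\chi_y = a^2\left(-\frac{2x}{y^2} + \frac{2x}{y^2}\right) = 0,$$
$$\chi_{y,y} + \frac{1}{y}\,\chi_y = a^2\left(\frac{2x}{y^2} - \frac{2x}{y^2}\right) = 0,$$
$$\chi_{x,y} + \chi_{y,x} + \frac{2}{y}\,\chi_x = a^2\left(\frac{2x^2}{y^3} - \frac{2}{y} + \frac{2}{y}\left(1 - \frac{x^2}{y^2}\right)\right) = 0.$$
Hence $\chi^i$ is also a Killing vector.
3 Marks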
(e) Evidently the metric does not depend on $x$ and thus $\xi^i = (1, 0)$ is a Killing vector. There are also no constants $c_1, c_2$ such that $\xi^i = c_1\zeta^i + c_2\chi^i$ and thus this vector is linearly independent from the ones in (d).
4 Marks
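(f) Here is one approach: By the chain rule, we have
$$\frac{d^2y}{dx^2} = \frac{1}{\dot x}\frac{d}{d\lambda}\frac{\dot y}{\dot x} = \frac{1}{\dot x^2}\left(\ddot y - \frac{\ddot x}{\dot x}\,\dot y\right) = \frac{1}{\dot x^2}\left(\ddot y - \frac{2\dot y^2}{y}\right),$$
where we used the geodesic equation from (a) in the last step. Therefore,
$$y\frac{d^2y}{dx^2} + \left(\frac{dy}{dx}\right)^2 + 1 = \frac{1}{\dot x^2}\left(y\ddot y - 2\dot y^2 + \dot y^2 + \dot x^2\right) = \frac{y}{\dot x^2}\left(\ddot y + \frac{1}{y}\left(\dot x^2 - \dot y^2\right)\right) = 0,$$
where we used again the geodesic equation from (a).
4 Marks
Moreover,
$$\frac{d^2}{dx^2}\, y^2 + 2 = 2\,\frac{d}{dx}\left(y\frac{dy}{dx}\right) + 2 = 2y\frac{d^2y}{dx^2} + 2\left(\frac{dy}{dx}\right)^2 + 2 = 0.$$
2 Marks
To see that the given semicircles solve this equation, note that
$$\frac{d^2}{dx^2}\, y^2 = \frac{d^2}{dx^2}\left(r^2 - (x - x_0)^2\right) = -2$$
for $y^2 = r^2 - (x - x_0)^2$.
2 Marks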
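The semicircle claim can also be checked numerically, as a complement to the derivation. The sketch below integrates the geodesic equations from (a) with a classical fourth-order Runge-Kutta step (a technique chosen for this illustration, not taken from the exam) and monitors the centre $x_0 = x + y\,\dot y/\dot x$ and radius $r = \sqrt{(x - x_0)^2 + y^2}$ of the circle through the current point, which should stay constant. The initial data and step size are arbitrary assumptions.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equations from (a).

    state = (x, y, vx, vy) with vx = dx/dlambda, vy = dy/dlambda.
    """
    x, y, vx, vy = state
    ax = 2.0 * vx * vy / y            # from 0 = x'' - (2/y) x' y'
    ay = -(vx * vx - vy * vy) / y     # from 0 = y'' + (1/y)(x'^2 - y'^2)
    return (vx, vy, ax, ay)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2.0))
    k3 = geodesic_rhs(shifted(state, k2, h / 2.0))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def circle_parameters(state):
    """Centre x0 and radius r of the semicircle through the current point."""
    x, y, vx, vy = state
    x0 = x + y * vy / vx              # from (x - x0) + y dy/dx = 0
    return x0, math.hypot(x - x0, y)

if __name__ == "__main__":
    state = (0.0, 1.0, 1.0, 0.5)      # assumed initial data with vx != 0
    print(circle_parameters(state))
    for _ in range(2000):
        state = rk4_step(state, 1e-3)
    print(circle_parameters(state))   # should match the first line closely
```

For the initial data shown, the centre starts at $x_0 = 0.5$ and both printed pairs should agree to many digits.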
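(g) The arc length for the given curve $C$ between $p$ and $q$ is
$$l = \int_C \sqrt{g_{ij}\,\dot x^i \dot x^j}\, d\lambda = \int_C \frac{a}{y}\sqrt{\dot x^2 + \dot y^2}\, d\lambda = \int_C \frac{a}{y}\sqrt{1 + \left(\frac{\dot y}{\dot x}\right)^2}\,\dot x\, d\lambda$$
$$= \int_{-c}^{c} \frac{a}{y}\sqrt{1 + \left(\frac{\dot y}{\dot x}\right)^2}\, dx = \int_{-c}^{c} \frac{a}{\sqrt{r^2 - x^2}}\sqrt{1 + \frac{x^2}{r^2 - x^2}}\, dx = \int_{-c}^{c} \frac{ar}{r^2 - x^2}\, dx$$
$$= \frac{a}{2}\int_{-c}^{c}\left(\frac{1}{r - x} + \frac{1}{r + x}\right) dx = a\left(\ln(r + c) - \ln(r - c)\right) = a\ln\left(\frac{r + c}{r - c}\right).$$
6 Marks
Total: 40 Marks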
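A quick numerical comparison of the integral with the closed form is sketched below, using composite Simpson's rule. The parameter values `a = 1.5`, `r = 2.0`, `c = 1.0` and the function names are assumptions made for the example.

```python
import math

def arc_length_numeric(a, r, c, n=2000):
    """Composite Simpson approximation of the integral of a*r/(r^2 - x^2)
    over [-c, c]; n must be even and 0 < c < r."""
    def integrand(x):
        return a * r / (r * r - x * x)

    h = 2.0 * c / n
    total = integrand(-c) + integrand(c)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight * integrand(-c + k * h)
    return total * h / 3.0

def arc_length_exact(a, r, c):
    """Closed form a * ln((r + c) / (r - c)) obtained in part (g)."""
    return a * math.log((r + c) / (r - c))

if __name__ == "__main__":
    a, r, c = 1.5, 2.0, 1.0
    print(arc_length_numeric(a, r, c))
    print(arc_length_exact(a, r, c))   # the two values should agree closely
```

As $c \to r$ the length diverges logarithmically, consistent with the endpoints of the semicircle lying on the boundary $y = 0$.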
Module Code
MAT00046M
MMath and MSc Examinations 2019/20
Department:
Mathematics
Title of Exam:
General Relativity
Time Allowed:
2 hours
Allocation of Marks:
The marking scheme shown on each question is indicative only.
Question:  1   2   3   4   5   Total
Marks:     14  14  18  26  28  100
Instructions for Candidates:
Answer all questions.
Please write your answers in ink; pencil is acceptable for graphs and diagrams.
Do not use red ink.
Calculators are not allowed.
Materials Supplied:
Green booklet
Do not write on this booklet before the exam begins.
Do not turn over this page until instructed to do so by an invigilator.
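Here is a list of some useful definitions and properties:
• Covariant derivative of a contravariant vector
$$v^\mu{}_{|\nu} = v^\mu{}_{,\nu} + \Gamma^\mu{}_{\nu\lambda}\, v^\lambda$$
• Christoffel symbols
$$\Gamma^\mu{}_{\nu\lambda} = \tfrac{1}{2}\, g^{\mu\eta}\left(g_{\eta\lambda,\nu} + g_{\nu\eta,\lambda} - g_{\nu\lambda,\eta}\right)$$
• Riemann tensor
$$R^\mu{}_{\nu\lambda\eta} = \Gamma^\mu{}_{\eta\nu,\lambda} - \Gamma^\mu{}_{\lambda\nu,\eta} + \Gamma^\mu{}_{\lambda\alpha}\,\Gamma^\alpha{}_{\eta\nu} - \Gamma^\mu{}_{\eta\alpha}\,\Gamma^\alpha{}_{\lambda\nu}$$
• Symmetries of the Riemann tensor
$$R_{\mu\nu\lambda\eta} = -R_{\nu\mu\lambda\eta} = -R_{\mu\nu\eta\lambda} \quad\text{and}\quad R^\mu{}_{[\nu\lambda\eta]} = 0$$
• Ricci tensor and Ricci scalar
$$R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu} \quad\text{and}\quad R = g^{\mu\nu} R_{\mu\nu}$$
• Einstein tensor
$$G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2}\, R\, g_{\mu\nu}$$
• Einstein's equation
$$G_{\mu\nu} = 8\pi T_{\mu\nu}$$
In $n$ dimensions, the Riemann tensor has $\frac{n^2(n^2 - 1)}{12}$ independent components.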
1 (of 5).
(a) For an arbitrary tensor field $f_{\mu\nu}$, write down the explicit formula for the covariant derivative $f_{\mu\nu|\lambda}$ in terms of $f_{\mu\nu}$, the Christoffel symbols $\Gamma^\lambda{}_{\mu\nu}$ and the partial derivatives $f_{\mu\nu,\lambda}$.
(b) Use the result from (a) to show that for the metric tensor, $g_{\mu\nu|\lambda} = 0$.
[14]
2 (of 5).
Let $g_{\mu\nu}$ be a pseudo-Riemannian metric. The absolute value of the determinant of the metric is denoted $|g|$ and its partial derivatives are equal to $|g|_{,\lambda} = |g|\, g^{\mu\nu} g_{\mu\nu,\lambda}$.
(a) Show that $\Gamma^\mu{}_{\mu\lambda} = \tfrac{1}{2}\, g^{\mu\nu} g_{\mu\nu,\lambda}$.
(b) Explain how this implies that for any vector field $v^\lambda$,
$$v^\lambda{}_{|\lambda} = \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\; v^\lambda\right)_{,\lambda}.$$
[14]
3 (of 5).
Consider the two-dimensional metric with coordinates $(t, \theta)$ defined by
$$d\tau^2 = t^{-1}\, dt^2 - t\, d\theta^2$$
for $t > 0$. For this metric,
(a) compute all the nonzero independent components of the Christoffel symbols $\Gamma^\lambda{}_{\mu\nu}$, and
(b) show that all components of the Riemann tensor $R^\rho{}_{\mu\nu\lambda}$ vanish.
[18]
4 (of 5).
Consider the energy tensor for a gas of pure radiation:
$$T^{\mu\nu} = \tfrac{4}{3}\,\rho\, v^\mu v^\nu - \tfrac{1}{3}\,\rho\, g^{\mu\nu},$$
where $\rho$ is a scalar and $v^\mu$ a timelike unit vector field.
(a) Define what it means for $v^\mu$ to be a timelike unit vector field and show that
$$v_\mu v^\mu{}_{|\nu} = 0.$$
[6]
(b) Assume that this $T^{\mu\nu}$ satisfies the energy conservation equation.
(i) Show that
$$0 = v^\nu \rho_{|\nu} + \tfrac{4}{3}\,\rho\, v^\nu{}_{|\nu}.$$
Hint: Compute $v_\mu T^{\mu\nu}{}_{|\nu}$.
(ii) Using this, compute a formula for $\rho\, v^\nu v^\mu{}_{|\nu}$ in terms of $v^\mu$ and the derivative $\rho^{|\mu}$.
[14]
(c) Assuming that Einstein's equation is satisfied with this energy tensor, and that spacetime is four-dimensional, compute the scalar curvature, $R$, and Ricci tensor, $R_{\mu\nu}$.
[6]
5 (of 5).
Consider the spherically symmetric metric,
$$d\tau^2 = \alpha\, dt^2 - \alpha^{-1}\, dr^2 - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right),$$
where $\alpha$ is a function of $r$.
(a) For a timelike geodesic (parametrized by proper time, $\tau$) in the equatorial plane ($\theta = \frac{\pi}{2}$) write down two constants of motion, based on the obvious symmetries of the metric.
[8]
(b) Use the constants to write a first order differential equation for $r$ as a function of $\tau$.
[12]
(c) Now specialize to the Schwarzschild spacetime with
$$\alpha = 1 - \frac{2M}{r},$$
where $M > 0$ is a constant. Find a first order differential equation for $u = r^{-1}$ as a function of $\phi$.
[8]
End of examination.
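SOLUTIONS: MAT00046M
1.
(a)
$$f_{\mu\nu|\lambda} = f_{\mu\nu,\lambda} - f_{\alpha\nu}\,\Gamma^\alpha{}_{\lambda\mu} - f_{\mu\alpha}\,\Gamma^\alpha{}_{\lambda\nu}.$$
6 Marks
(b) $g_{\mu\nu|\lambda} = g_{\mu\nu,\lambda} - g_{\alpha\nu}\,\Gamma^\alpha{}_{\lambda\mu} - g_{\mu\alpha}\,\Gamma^\alpha{}_{\lambda\nu}$. By the definition of the Christoffel symbols,
$$g_{\alpha\nu}\,\Gamma^\alpha{}_{\lambda\mu} = \tfrac{1}{2}\left(g_{\nu\mu,\lambda} + g_{\lambda\nu,\mu} - g_{\lambda\mu,\nu}\right),$$
since the metric cancels with the inverse metric. This gives
$$g_{\mu\nu|\lambda} = g_{\mu\nu,\lambda} - \tfrac{1}{2}\left(g_{\nu\mu,\lambda} + g_{\lambda\nu,\mu} - g_{\lambda\mu,\nu}\right) - \tfrac{1}{2}\left(g_{\mu\nu,\lambda} + g_{\lambda\mu,\nu} - g_{\lambda\nu,\mu}\right) = \tfrac{1}{2}\left(g_{\mu\nu,\lambda} - g_{\nu\mu,\lambda}\right),$$
which is 0, because $g_{\mu\nu} = g_{\nu\mu}$.
8 Marks
Total: 14 Marks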
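2.
(a) By definition,
$$\Gamma^\nu{}_{\mu\nu} = \tfrac{1}{2}\, g^{\nu\lambda}\left(g_{\lambda\nu,\mu} + g_{\mu\lambda,\nu} - g_{\mu\nu,\lambda}\right).$$
The last two terms cancel because of the symmetry of the metric, so
$$\Gamma^\nu{}_{\mu\nu} = \tfrac{1}{2}\, g^{\nu\lambda} g_{\nu\lambda,\mu}.$$
6 Marks
(b) Now, using the given formula for $|g|_{,\mu}$,
$$\Gamma^\nu{}_{\mu\nu} = \tfrac{1}{2}\, |g|^{-1}\, |g|_{,\mu} = \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\right)_{,\mu}.$$
This gives
$$v^\mu{}_{|\mu} = v^\mu{}_{,\mu} + \Gamma^\nu{}_{\mu\nu}\, v^\mu = v^\mu{}_{,\mu} + \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\right)_{,\mu} v^\mu = \frac{1}{\sqrt{|g|}}\left(\sqrt{|g|}\; v^\mu\right)_{,\mu}.$$
8 Marks
Total: 14 Marks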
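3.
(a) The nonzero components of the metric are $g_{tt} = t^{-1}$ and $g_{\theta\theta} = -t$. The metric only depends upon the $t$ coordinate, so its nonzero derivatives are $g_{tt,t} = -t^{-2}$ and $g_{\theta\theta,t} = -1$.
The only nonzero components of $\Gamma_{\lambda\mu\nu} = \tfrac{1}{2}\left(g_{\lambda\nu,\mu} + g_{\mu\lambda,\nu} - g_{\mu\nu,\lambda}\right)$ are those with these combinations of index values. These are
$$\Gamma_{ttt} = -\tfrac{1}{2}\, t^{-2}, \qquad \Gamma_{\theta t\theta} = \Gamma_{\theta\theta t} = -\tfrac{1}{2}, \qquad \Gamma_{t\theta\theta} = \tfrac{1}{2}.$$
The nonzero components of the inverse metric are $g^{tt} = t$ and $g^{\theta\theta} = -t^{-1}$. Applying the inverse metric gives the nonzero Christoffel symbols:
$$\Gamma^t{}_{tt} = -\tfrac{1}{2}\, t^{-1}, \qquad \Gamma^\theta{}_{t\theta} = \Gamma^\theta{}_{\theta t} = \tfrac{1}{2}\, t^{-1}, \qquad \Gamma^t{}_{\theta\theta} = \tfrac{1}{2}\, t.$$
12 Marks
(b) In two dimensions, the Riemann tensor has only one independent component. Using (a),
$$R^t{}_{\theta t\theta} = \Gamma^t{}_{\theta\theta,t} - \Gamma^t{}_{\theta t,\theta} + \Gamma^t{}_{tt}\,\Gamma^t{}_{\theta\theta} - \Gamma^t{}_{\theta\theta}\,\Gamma^\theta{}_{\theta t} = \tfrac{1}{2} - 0 - \tfrac{1}{4} - \tfrac{1}{4} = 0.$$
6 Marks
Total: 18 Marks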
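The arithmetic in (b) can be spot-checked numerically. The sketch below hard-codes the Christoffel symbols found in (a), differentiates $\Gamma^t{}_{\theta\theta}$ by central differences, and evaluates the same four-term expression for $R^t{}_{\theta t\theta}$; the dictionary keys, function names and sample values of $t$ are illustrative assumptions.

```python
def christoffels(t):
    """Nonzero Christoffel symbols of d tau^2 = dt^2 / t - t dtheta^2 from (a)."""
    return {
        "t_tt": -0.5 / t,      # Gamma^t_{tt}
        "th_tth": 0.5 / t,     # Gamma^theta_{t theta} = Gamma^theta_{theta t}
        "t_thth": 0.5 * t,     # Gamma^t_{theta theta}
    }

def riemann_t_th_t_th(t, eps=1e-5):
    """R^t_{theta t theta} assembled term by term as in part (b)."""
    gamma = christoffels(t)
    # d/dt of Gamma^t_{theta theta} by central differences.
    d_t = (christoffels(t + eps)["t_thth"]
           - christoffels(t - eps)["t_thth"]) / (2.0 * eps)
    # Gamma^t_{theta t} vanishes, so its theta-derivative contributes zero.
    return (d_t - 0.0
            + gamma["t_tt"] * gamma["t_thth"]
            - gamma["t_thth"] * gamma["th_tth"])

if __name__ == "__main__":
    for t in (0.5, 1.0, 3.0):
        print(t, riemann_t_th_t_th(t))   # each value should be close to 0
```

This only confirms the one independent component at sample points; the exam solution remains the actual argument.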
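4.
(a) $v_\mu v^\mu = 1$. Differentiating this equation gives
$$0 = v_{\mu|\nu}\, v^\mu + v_\mu v^\mu{}_{|\nu} = 2\, v_\mu v^\mu{}_{|\nu}.$$
6 Marks
(b)
(i) The energy conservation equation is
$$0 = T^{\mu\nu}{}_{|\nu} = \tfrac{4}{3}\,\rho_{|\nu} v^\nu v^\mu + \tfrac{4}{3}\,\rho\, v^\nu{}_{|\nu}\, v^\mu + \tfrac{4}{3}\,\rho\, v^\nu v^\mu{}_{|\nu} - \tfrac{1}{3}\,\rho^{|\mu}.$$
From this (using $v_\mu v^\mu = 1$ and $v_\mu v^\mu{}_{|\nu} = 0$)
$$0 = v_\mu T^{\mu\nu}{}_{|\nu} = \tfrac{4}{3}\,\rho_{|\nu} v^\nu + \tfrac{4}{3}\,\rho\, v^\nu{}_{|\nu} + 0 - \tfrac{1}{3}\, v_\mu \rho^{|\mu} = v^\nu \rho_{|\nu} + \tfrac{4}{3}\,\rho\, v^\nu{}_{|\nu}.$$
7 Marks
(ii) Using this result,
$$0 = T^{\mu\nu}{}_{|\nu} = \tfrac{4}{3}\,\rho_{|\nu} v^\nu v^\mu + \left(-v^\nu \rho_{|\nu}\right) v^\mu + \tfrac{4}{3}\,\rho\, v^\nu v^\mu{}_{|\nu} - \tfrac{1}{3}\,\rho^{|\mu} = \tfrac{1}{3}\,\rho_{|\nu} v^\nu v^\mu + \tfrac{4}{3}\,\rho\, v^\nu v^\mu{}_{|\nu} - \tfrac{1}{3}\,\rho^{|\mu}.$$
This gives that
$$\rho\, v^\nu v^\mu{}_{|\nu} = \tfrac{1}{4}\,\rho^{|\mu} - \tfrac{1}{4}\,\rho_{|\nu} v^\nu v^\mu.$$
7 Marks
(c) Einstein's equation is
$$R_{\mu\nu} - \tfrac{1}{2}\, R\, g_{\mu\nu} = 8\pi T_{\mu\nu} = \frac{32\pi}{3}\,\rho\, v_\mu v_\nu - \frac{8\pi}{3}\,\rho\, g_{\mu\nu}.$$
Taking the trace of this (with $g_\mu{}^\mu = 4$ and $v_\mu v^\mu = 1$, so that the right-hand side vanishes) gives
$$R - 2R = -R = 0,$$
so $R = 0$ and
$$R_{\mu\nu} = \frac{32\pi}{3}\,\rho\, v_\mu v_\nu - \frac{8\pi}{3}\,\rho\, g_{\mu\nu}.$$
6 Marks
Total: 26 Marks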
5.
(a) The components of the metric do not depend upon $t$ or $\phi$, therefore $\xi^\mu = (1, 0, 0, 0)$ and $\zeta^\mu = (0, 0, 0, 1)$ are Killing vectors, and
$$E = \xi^\mu \dot x_\mu = \alpha\,\dot t \qquad\text{and}\qquad h = -\zeta^\mu \dot x_\mu = r^2\dot\phi$$
are constants of motion.
8 Marks
(b) These constants of motion give the formulas $\dot t = E\alpha^{-1}$ and $\dot\phi = h r^{-2}$. Because the geodesic is parameterized by proper time, $\dot x^\mu$ is a unit vector. This means that
$$1 = g_{\mu\nu}\,\dot x^\mu \dot x^\nu = \alpha\,\dot t^2 - \alpha^{-1}\dot r^2 - r^2\dot\phi^2 = E^2\alpha^{-1} - \alpha^{-1}\dot r^2 - h^2 r^{-2}.$$
This is easily solved for $\dot r^2$:
$$\dot r^2 = E^2 - \left(1 + h^2 r^{-2}\right)\alpha.$$
12 Marks
(c) Using $u = r^{-1}$ and $\alpha = 1 - 2Mu$,
$$\left(\frac{du}{d\phi}\right)^2 = r^{-4}\,\frac{\dot r^2}{\dot\phi^2} = \frac{\dot r^2}{h^2} = \frac{E^2 - \left(1 + h^2 r^{-2}\right)\alpha}{h^2} = \frac{E^2 - \left(1 + h^2 u^2\right)(1 - 2Mu)}{h^2}$$
$$= E^2 h^{-2} - \left(h^{-2} + u^2\right)(1 - 2Mu).$$
8 Marks
Total: 28 Marks
Hydrodynamic Stability
Qianhe Chai
(Dated: December 10, 2021)
1 History and Background
Hydrodynamic stability is the field that studies the stability and instability of fluid flows. Instability of a flow may in turn lead to turbulence. The theoretical and experimental foundations of hydrodynamic stability were laid mainly in the nineteenth century by scientists such as Helmholtz, Kelvin, Rayleigh and Reynolds.
In studying the dynamics of any physical system, the concept of stability makes sense only after the possibility of equilibrium has first been established. Once this step is taken, the concept of stability becomes universal, regardless of the actual system being probed. As stated by Betchov and Criminale (1967), stability can be defined as the immunity of a dynamical system to small disturbances ([2] Criminale W O, Jackson T L, Joslin R D, p. 1). Of course, the magnitude of the disturbance need not be small; the disturbance may be amplified, and the system then departs from equilibrium. If no equilibrium exists, then it can be concluded that the particular system in question is statically unstable and its dynamics is an open question. Such stability tests can be, and are, performed in any field, such as mechanics, astronomy, electronics and biology. In each case in this list, the common thread is that only a finite number of discrete degrees of freedom are required to describe the motion, and only one independent variable is required. The same question can be posed for a continuum, but then the number of degrees of freedom is infinite and the governing equations are partial differential equations rather than ordinary ones ([2] Criminale W O, Jackson T L, Joslin R D, p. 2). It is therefore hard to draw conclusions in any general way, but it is not impossible. In fact, many such systems have been successfully analyzed, especially in fluid mechanics.
2 Initial-Value Concepts and Stability Bases
At this stage, an initial-value problem in time and a boundary-value problem in space arise and must be solved to determine whether a given flow is unstable. In this respect the problem is well defined but, as we shall see, there are many difficulties in actually carrying out this task. Of course, there is more than one definition of stability, but the main question is whether the behavior of the disturbance causes irreversible changes in the mean flow. In short, if the disturbance is followed from the initial moment and the flow returns to the basic state, then the flow is said to be stable. Instability can occur in a variety of ways, but the first thing to understand is which methods are available to resolve these problems so that any decision can be made. It can be seen from the outset that the order of the system is higher than that of the second-order boundary-value problems of traditional mathematical physics. Therefore, some classical methods of analysis are of limited value; others need to be extended or modified in order to work here.
Any velocity vector field can be decomposed into solenoidal, rotational and harmonic components. For the problem discussed here there is no solenoidal part, because the fluid is incompressible and ∇ · u = 0. The physical basis is that the rotational part of the velocity corresponds to the disturbance vorticity and the harmonic part is related to the pressure. This analogy helps to explain the physics and, even though the boundary conditions must be specified in terms of the velocity, the initial specification can be given in terms of the vorticity. In this respect, when the governing equations for a mean flow are expressed in terms of vorticity, the vorticity is essentially the quantity that is diffused or advected, and the velocity profile is the result of that action. The same inferences can be made about the disturbance field.
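The relation
$$\nabla^2 \tilde p = -2\rho\, U' \frac{\partial \tilde v}{\partial x}$$
is the equation for the perturbation pressure. Due to the interaction between the fluctuating and mean strain rates, there is an inhomogeneous term that acts as an effective source for the pressure. When there is no such straining, the pressure is harmonic. If the velocity is not solenoidal, factors related to fluid compressibility come into play. Now, the definitions of the perturbation vorticity components are
$$\tilde\omega_x = \frac{\partial \tilde w}{\partial y} - \frac{\partial \tilde v}{\partial z}, \qquad \tilde\omega_y = \frac{\partial \tilde u}{\partial z} - \frac{\partial \tilde w}{\partial x}, \qquad \tilde\omega_z = \frac{\partial \tilde v}{\partial x} - \frac{\partial \tilde u}{\partial y},$$
respectively, since ω = ∇ × u. By using these definitions and taking the curl of the corresponding set of momentum equations, the following are obtained:
$$\left(\frac{\partial}{\partial t} + U\frac{\partial}{\partial x}\right)\tilde\omega_x - \nu\nabla^2\tilde\omega_x = -U'\frac{\partial \tilde w}{\partial x} = \Omega_z\frac{\partial \tilde w}{\partial x},$$
$$\left(\frac{\partial}{\partial t} + U\frac{\partial}{\partial x}\right)\tilde\omega_y - \nu\nabla^2\tilde\omega_y = -U'\frac{\partial \tilde v}{\partial z} = \Omega_z\frac{\partial \tilde v}{\partial z},$$
$$\left(\frac{\partial}{\partial t} + U\frac{\partial}{\partial x}\right)\tilde\omega_z - \nu\nabla^2\tilde\omega_z = -U'\frac{\partial \tilde w}{\partial z} + U''\tilde v = \Omega_z\frac{\partial \tilde w}{\partial z} - \Omega_z'\,\tilde v,$$
where $\Omega_z = -dU/dy$ is the single component of the mean vorticity and is in the z-direction. Each equation contains the expected advection by the mean velocity and viscous diffusion, but each also has an inhomogeneous term due to the interaction between the fluctuating strain and the mean vorticity. As in the pressure relation, these interactions are necessary for any generation of the respective fluctuating vorticity components. However, it is worth noting that no such generation occurs if there is neither a $\tilde w$ component of velocity nor any spatial dependence in the z-direction, because the problem is then two-dimensional; in that case the fluctuating vorticity components, except $\tilde\omega_z$, can only be advected and diffused, regardless of any initial input ([2] Criminale W O, Jackson T L, Joslin R D, p. 12).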
3 Milestone Work
Orr (1907a,b) and Sommerfeld (1908) independently formulated the viscous stability problem. Both attempted to study channel flow, Orr considering plane Couette flow and Sommerfeld considering plane Poiseuille flow. Of course, one case is a limit of the other, and the combination of the two gives rise to the Orr-Sommerfeld equation, which has become an important basis for hydrodynamic stability theory. Even here, it should be remembered that it was not until 22 years after the equation was derived that any solution was available. Tollmien (1929) calculated the first neutral eigenvalues for plane Poiseuille flow, indicating that there is a critical Reynolds number. This work built on the development of the Tietjens function (Tietjens, 1925) and the analysis of Heisenberg (1924), related to the topic of resistive instability. Romanov (1973) proved theoretically that plane Couette flow is stable. Unlike pipe flow, there is no experimental controversy. Plane Poiseuille flow, on the other hand, is unstable.
Schlichting (1932a,b, 1933a,b,c, 1934, 1935) continued Tollmien's work and extended it further. The combination of these efforts led to the remarkable result now associated with the oscillations of parallel or nearly parallel flows, the Tollmien-Schlichting waves. It should be noted that these are frictional waves that would not exist without viscosity, and such flows are known to exist only when solid boundaries are present in the fluid. In the limit of infinite Reynolds number, the flow tends to be stable.
Prandtl (1921-1926, 1930, 1935) was very active in issues related to stability and hoped that the theory would lead to the prediction of transition and turbulence. Such success has not been achieved so far, but efforts continue as understanding advances. During this period, however, the main impetus for stability analysis was the work of Taylor (1923), whose theory for the case of rotating concentric cylinders was confirmed by experiment. Taylor himself carried out these experiments, and this work went on to become a model for understanding the stability of mean flows with curved streamlines.
The emergence of matched asymptotic expansions and singular perturbation analysis brought new vitality to the theory. Lin (1944, 1945) took advantage of these and redid all the previous calculations, thus confirming the results that had been obtained by less sophisticated means. Experiment also gained momentum with the work of Schubauer and Skramstad (1943), which set a standard for investigations of the flat-plate boundary layer. Here, a vibrating ribbon is used to simulate a controlled disturbance, a Tollmien-Schlichting wave, at the boundary. This method is still used by many people.
Studies of the stability of compressible flows were not completed until much later; Landau (1944), Lees (1947) and Dunn and Lin (1955) were major contributors at the time. Physically and mathematically, this is a much more complex problem, which is understandable given the time it took to resolve the theory for an incompressible medium. A wide range of issues is examined here, including different prototype flows and Mach numbers up to hypersonic values.
Nonlinearity in stability theory has also been assessed. Meksyn and Stuart (1951), Benney (1961, 1964) and Eckhaus (1962a,b, 1963, 1965) were all early contributors to what is now known as weakly nonlinear theory. Each effort focuses on a different aspect of the problem. Examples include the development of longitudinal or streamwise vortices in the boundary layer, the nonlinear critical layer, and the possibility of a limiting amplitude for an amplifying disturbance. The role of streamwise vorticity in the breakdown from laminar to turbulent flow has more recently been explored using the full Navier-Stokes (N-S) equations. To this end, Fasel (1990), Fasel and Thumm (1991), Schmid and Henningson (1992a,b) and Joslin, Streett and Chang (1993) introduced oblique waves with amplitudes ranging from very small to finite. This interaction of oblique waves results in a dominant streamwise vortex structure. When the amplitude of the waves is very small, the disturbance first amplifies and then decays somewhere downstream.
4 Latest Research and Its Application
A lattice hydrodynamic model of traffic that takes a density integral effect into account has been proposed and analyzed. Based on linear stability theory, the linear stability condition of the new model is derived, and the improvement in traffic stability obtained by considering continuous historical density information is revealed. Nonlinear analysis then reveals the nonlinear properties of the extended model. The mKdV equation near the critical point is derived and its kink-antikink wave solution is obtained, which is verified by numerical simulation. The results show that the density integral effect can effectively suppress traffic congestion in the lattice hydrodynamic model ([1] He Y C, Zhang G, Chen D).
At present, road traffic problems, especially traffic congestion, are becoming more and more serious. A lot of research has been done to understand the rules of traffic congestion. The existing traffic models can be roughly divided into microscopic and macroscopic traffic models, both of which can reproduce many real traffic phenomena.
The nonlinear analysis of the new model near the critical point is carried out to obtain the characteristics of traffic evolution. The theoretical results show that considering the density integral effect expands the stable region of traffic flow, and that the solution of the mKdV equation describes the unstable traffic flow. Finally, numerical simulation further confirms that the density integral effect is very helpful for improving traffic stability and curbing traffic congestion ([1] He Y C, Zhang G, Chen D).
References
[1] He Y C, Zhang G, Chen D. Effect of density integration on the stability of a new lattice hydrodynamic model[J]. International Journal of Modern Physics B, 2019.
[2] Criminale W O, Jackson T L, Joslin R D. Theory and Computation in Hydrodynamic Stability[M]. 2018.
General Relativity
Christopher J Fewster∗
Department of Mathematics, University of York, Heslington, York YO10 5DD, United Kingdom.
∗ chris.fewster@york.ac.uk
November 29, 2021
Contents
1 Introduction
2 Smooth spaces and manifolds
3 Tensors on smooth manifolds
  3.1 Vectors
  3.2 Covectors
  3.3 Tensors
  3.4 Algebraic operations with tensors
4 Metric tensors
5 Geodesics
  5.1 Nonexaminable: Proof of the theorem
  5.2 Status check
6 Differentiation of tensors
  6.1 Derivative operators
  6.2 Comparison of derivative operators
  6.3 Levi–Civita derivative operator
  6.4 Geodesics encore
  6.5 Killing vectors
  6.6 Application: cosmological red shift
  6.7 Fermi coordinates and minimal coupling
7 Curvature
  7.1 The Riemann tensor
  7.2 Symmetries of the Riemann tensor
  7.3 Ricci tensor and scalar
  7.4 Visualising curvature
  7.5 The second Bianchi identity
8 The Einstein field equations
  8.1 The energy-momentum tensor
  8.2 Geodesic deviation
  8.3 Absolute and relative acceleration
  8.4 General Relativity as a generalisation of Newtonian gravity
9 The Schwarzschild solution
  9.1 The Schwarzschild metric
  9.2 Orbits in Schwarzschild
  9.3 The Schwarzschild black hole
10 Cosmology
1 Introduction
Relativity? I have never been able to understand what that word means in this connection. I used to think that this was my fault, some flaw in my intelligence, but it is now
apparent that nobody ever understood it, probably not even Einstein himself. So let it
go. What is before us is Einstein’s theory of gravitation.¹
Einstein began to develop a theory of gravity as early as 1907, when he was still working at the
Patent Office in Bern. He was guided by various physical heuristics, in particular the equivalence
principle and general covariance. Roughly (the opening quote shows that these ideas are perhaps
less clear-cut than they appear) the equivalence principle states that
• a freely falling observer does not experience gravity, but feels themself to be weightless
• an accelerating observer (in the absence of gravity) cannot distinguish their experiences from
what would happen if they were at rest in a suitable gravitational field
while general covariance is the idea that the laws of physics should take the same essential form in
any system of coordinates. By comparison, Einstein’s 1905 theory of Special Relativity (SR) does
not describe gravitation and only makes the assumption that the laws of physics appear the same
when written in inertial frames of reference. In General Relativity (GR) a freely falling observer is
not accelerating. It is the non-freely falling observers (e.g., sitting in a chair) who are accelerating.
The path from Einstein’s startling realisation of the equivalence principle (sitting in a chair in
the patent office!) to the eventual form of GR in 1915 was long and tortuous. Einstein made many
conceptual and calculational errors along the way, and at times rejected some elements of what
eventually became the bedrock of the theory. This further illustrates the point that the heuristics are
not easily turned into a working theory of gravity. The approach we will take is to present Einstein’s
theory as it turned out, and indicate how it reflects the equivalence principle and general covariance,
rather than trying to derive the theory from them. The remarkable success of General Relativity as
a physical theory has been emphasised in the last few years by the award of two Nobel Prizes in
Physics:²
• 2017 to Rainer Weiss, Kip Thorne and Barry Barish, “for decisive contributions to the LIGO
detector and the observation of gravitational waves”
• 2020 to Roger Penrose, “for the discovery that black hole formation is a robust prediction of the
general theory of relativity”, and to Reinhard Genzel and Andrea Ghez, “for the discovery of a
supermassive compact object at the centre of our galaxy”.
Books There are many excellent books on GR (with varying conventions!). In particular I mention:
• MA Ludvigsen, General Relativity: A geometric approach (CUP, 1999) – relatively short and
similar in spirit to the approach we will take
• SM Carroll, Spacetime and Geometry: An introduction to general relativity (CUP, 2019) – a
comprehensive modern treatment at advanced undergraduate/beginning research level
• RM Wald, General Relativity (University of Chicago, 1984) – a highly influential text,
particularly recommended for those thinking of taking GR further.
¹ JL Synge, Relativity: The General Theory (1960)
² Einstein won his own Nobel Prize in 1921 for his work on the photoelectric effect, not for either SR or GR.
1 (of 63)
CJF, Autumn 2021-22
High level summary General Relativity is a profound physical theory, but it is also highly
mathematical, making use of differential geometry and particularly tensor calculus. This will take
some time to develop but is essential to a proper understanding of the theory.
To give a collection of signposts, here is a high-level description of GR – which should become
more understandable as we proceed through the course. Firstly, the mathematical model of spacetime
is the following:
• Spacetime is a 4-dimensional smooth manifold whose points correspond to the location of
potential events
• Spacetime is equipped with a Lorentzian metric $g_{\mu\nu}$ that determines distance, angle and
causal relations
• The metric determines a natural derivative operator on tensor fields, and its curvature can
be described by the Riemann tensor and other tensors derived from it.
Turning to the physical content, the idea is that gravity is entirely described by the spacetime metric
and particularly its curvature.
• The worldlines of freely falling observers are timelike geodesics of the metric
• Light rays follow null geodesics of the metric.
• General matter distributions are described by an energy-momentum tensor $T_{\mu\nu}$
• The metric and the energy-momentum tensor are linked by the Einstein equations
$$G_{\mu\nu} - \Lambda g_{\mu\nu} = 8\pi G_N T_{\mu\nu}$$
where the Einstein tensor $G_{\mu\nu}$ is determined by the spacetime curvature, $G_N$ is Newton’s
constant of gravitation and $\Lambda$ is the cosmological constant.
The first pair of items were summarised by JA Wheeler in the phrase ‘spacetime tells matter how to
move’ and the second pair as ‘matter tells spacetime how to curve’.
Units and conventions We will almost always adopt units in which the speed of light is unity,
c = 1. For example, if time is measured in seconds, then distance is measured in light seconds (one
light second being exactly 299,792,458 m). This removes annoying factors in many expressions.
In some situations we might also set G N = 1 by a further choice of units. General Relativity has
a number of objects that are defined with different sign conventions by different authors. For easy
reference, the main conventions in this course are:
• the metric is ‘mostly minus’ in signature, e.g., the metric of special relativity is $\eta = \mathrm{diag}(+1, -1, -1, -1)$, so that $g_{\alpha\beta} v^\alpha v^\beta > 0$ for any timelike vector $v$.
• the Riemann tensor (when we get to it) is defined so that $(\nabla_\alpha \nabla_\beta - \nabla_\beta \nabla_\alpha) u^\mu = R^{\mu}{}_{\nu\alpha\beta}\, u^\nu$
• the Ricci tensor is defined by $R_{\nu\beta} = R^{\mu}{}_{\nu\mu\beta}$.
When reading books or papers on GR, one should bear in mind that conventions may be different,
which means that some signs may differ. The conventions in this module match those in Quantum
Field Theory (regarding the metric signature) and Advanced General Relativity.
2 Smooth spaces and manifolds
An n-dimensional manifold is a set with structures that allow us to do calculus, and whose local
structure can be described by pieces of Rn . Our approach is slightly nonstandard (but equivalent
to the usual one) and focusses attention on the basic objects of multi-dimensional calculus, namely
smooth curves and smooth functions. When applied to GR, particles and observers will follow
smooth curves in spacetime, and smooth functions might represent physical quantities such as the
energy density seen by a family of observers.
[Figure: a smooth curve c : R → M, t ↦ c(t), passing through an open set O ⊆ M, and a smooth function f : M → R, p ↦ f (p).]
We begin by defining a smooth space as a set M together with a subset C of the functions from
R to M, and a subset F of the functions from M to R, and a set O of subsets of M. Any c ∈ C
is called a smooth curve, any f ∈ F is called a smooth function and any O ∈ O is called an open
set. The composition f ◦ c of smooth function f and smooth curve c is a map from R to R. These
objects are required to obey some intuitive conditions:
• If c ∈ C and f ∈ F then f ◦ c ∈ C ∞ (R), i.e., it has derivatives of all orders
• If f : M → R is any function such that f ◦ c ∈ C ∞ (R) for all c ∈ C then f ∈ F .
• If c : R → M is any function such that f ◦ c ∈ C ∞ (R) for all f ∈ F then c ∈ C.
• For every pair of distinct points p, q ∈ M, p ≠ q, there is a function f ∈ F that distinguishes
them, f (p) ≠ f (q), and a curve c ∈ C that joins them, p, q ∈ Im c.
[Figure: distinct points p, q ∈ M joined by a curve c and separated by a function f : M → R with f (p) ≠ f (q).]
• Any open set is a union of sets of the form {p ∈ M : f (p) ≠ 0} for f ∈ F – every open set
can be built using functions in F .
• If O is open and c ∈ C passes through a point c(t) ∈ O, for some t ∈ R, then c(s) ∈ O for all
s sufficiently close to t – curves can’t suddenly jump out of an open set.
It is not necessary to remember all of this, and not much of it will be very visible in the course.
The important points are that smooth spaces have good notions of smooth functions and smooth
curves, and that ‘smoothness’ is based on conventional differentiability of real-valued functions on R.
Note that:
• Constant functions ( f (p) = a for all p ∈ M) and constant curves (c(t) = p for all t ∈ R)
always belong to F and C respectively. For if f is constant then so is f ◦ c for any c ∈ C,
and therefore f ◦ c ∈ C ∞ (R) for all c ∈ C; hence f ∈ F . Exercise: Write out the analogous
argument for constant curves.
• Any smooth space may be reconstructed from just M plus either of F or C.
Example 1. Here are some examples – we won’t concern ourselves with checking all of the details.
1. M = Rn is a smooth space with C = C ∞ (R, Rn ), F = C ∞ (Rn, R), and O consisting of the usual
open subsets of Rn , i.e., those O ⊆ Rn such that every point p ∈ O is surrounded by a ball of
nonzero radius that is also contained in O.
2. If O ⊆ Rn is open and connected (i.e., every two points in O are joined by a continuous curve)
then O together with all CO = C ∞ (R, O) defines a smooth space, in which FO = C ∞ (O, R).
[All the smooth curves in CO are also smooth curves in Rn , and all smooth functions on
Rn define corresponding functions in FO by restriction to O. However functions in FO may
become unbounded as they approach the boundary of O – these functions cannot be obtained
by restricting a smooth function on Rn . Meanwhile, it turns out that OO = {P ∩ O : P ∈ O}.]
3. In fact, the same applies whenever O is an open subset of a smooth space M: O becomes a
smooth space with curves CO = {c ∈ C : Im c ⊆ O}.
4. If ϕ ∈ C ∞ (Rn, R), let M be the level set
M = {x ∈ Rn : ϕ(x) = 0}.
If $\nabla\varphi = (\partial\varphi/\partial x^1, \dots, \partial\varphi/\partial x^n)$ is never zero on $M$ then $M$ defines a smooth space with
$\mathcal{C} = \{c \in C^\infty(\mathbb{R}, \mathbb{R}^n) : \operatorname{Im} c \subseteq M\}$. In particular, this shows that spheres of nonzero radius
are smooth spaces, using e.g., $\varphi(x) = \|x\|^2 - a^2$ for constant $a > 0$.
Definition 2. For smooth spaces M and N, C ∞ (M, N) denotes the set of all functions f : M → N
such that f ◦ c ∈ CN for all c ∈ CM , or, equivalently, g ◦ f ∈ FM for all g ∈ FN . In particular, we
may now write FM = C ∞ (M, R), CM = C ∞ (R, M).
We now know what is meant by C ∞ (Rn, M) and C ∞ (M, Rn ) for any smooth space M and likewise
for C ∞ (O, M), C ∞ (M, O) for open O ⊆ Rn , and e.g., for C ∞ (S2, S3 ) where Sn is the unit sphere in
Rn+1 . Composition of smooth functions works as you would hope:
Lemma 3. If g ∈ C ∞ (L, M) and f ∈ C ∞ (M, N) then f ◦ g ∈ C ∞ (L, N).
Proof. Let c be a smooth curve in L. Then ( f ◦ g) ◦ c = f ◦ (g ◦ c). Because g ∈ C ∞ (L, M), g ◦ c is a
smooth curve in M; hence, f ◦ (g ◦ c) is a smooth curve in N because f ∈ C ∞ (M, N). As ( f ◦ g) ◦ c
is a smooth curve in N for every smooth curve c in L, we have proved that f ◦ g ∈ C ∞ (L, N).
Manifolds are smooth spaces obeying extra conditions relating to coordinate charts.
Definition 4. If O is an open subset of M and X ∈ C ∞ (O, Rn ) is such that
• X(O) := {X(p) : p ∈ O} is an open subset of Rn
• X has a smooth inverse, X −1 ∈ C ∞ (X(O), O)
then we say that X defines an n-dimensional chart with chart domain O and chart image X(O).
Writing X(p) = (X 1 (p), . . . , X n (p)) defines smooth coordinate functions X 1, . . . , X n on O.
[Figure: a chart X maps the chart domain O ⊆ M, containing the point p, onto the chart image X(O) ⊆ Rn, containing X(p).]
Often we will label coordinates from 0 to n − 1, particularly when discussing spacetimes.
Overlapping charts are automatically compatible: if O1 and O2 are overlapping n-dimensional
chart domains with coordinate maps X and Y respectively, then
Y ◦ X −1 ∈ C ∞ (X(O1 ∩ O2 ),Y (O1 ∩ O2 ))
and
X ◦ Y −1 ∈ C ∞ (Y (O1 ∩ O2 ), X(O1 ∩ O2 ))
because compositions of smooth functions are smooth.
Definition 5. Let n ∈ N. An n-dimensional smooth manifold (or just n-manifold for short) is a
smooth space, each of whose points is contained in an n-dimensional chart domain, and which
admits ‘partitions of unity’. Any collection of charts whose domains cover the manifold is called an
atlas.
Rn , open subsets of Rn and the unit n-sphere are all examples of n-dimensional manifolds.
The existence of ‘partitions of unity’ means, roughly, that smooth functions can be decomposed
into pieces which live inside individual chart domains in an atlas. The precise definition is technical
and nonexaminable (see below). The main message of the definition is that any manifold can be
covered by an atlas of chart domains of fixed dimension. Typically more than one chart is needed
for an atlas, e.g., the 2-sphere S2 needs a minimum of two.
Example 6. A typical point on $S^2$ is represented by $(x, y, z) \in \mathbb{R}^3$ with $x^2 + y^2 + z^2 = 1$. Let
$O = S^2 \setminus \{(0, 0, -1)\}$, i.e., we remove the south pole. Then $O$ is the chart domain for the stereographic
projection map
$$S(x, y, z) = \left(\frac{2x}{z+1}, \frac{2y}{z+1}\right)$$
with chart image $S(O) = \mathbb{R}^2$. Geometrically, $S(p)$ tells us where the straight line from the south pole
through $p$ intersects the plane $z = 1$. (The south pole would be mapped to infinity, which is why we
have to remove it.) By constructing a similar map based on removing the north pole we obtain two
2-dimensional charts that together cover $S^2$.
If $c : \mathbb{R} \to \mathbb{R}^3$, $c(t) = (x(t), y(t), z(t))$ is any smooth curve with $\operatorname{Im} c \subseteq O$, then $S \circ c : \mathbb{R} \to \mathbb{R}^2$ is
smooth, because $c$ avoids the south pole, where $z = -1$. Thus $S$ is smooth. It is an exercise to show
that the inverse $S^{-1} : \mathbb{R}^2 \to S^2$,
$$S^{-1}(X, Y) = \left(\frac{4X}{4 + X^2 + Y^2},\ \frac{4Y}{4 + X^2 + Y^2},\ \frac{4 - X^2 - Y^2}{4 + X^2 + Y^2}\right),$$
is also smooth.
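The two formulas of Example 6 can be checked numerically. The following is only a sketch: the function names and the helper `sphere_point` are illustrative choices, not part of the notes. It confirms that the stated inverse lands on the unit sphere and undoes the projection at a sample point.

```python
import math

def stereographic(x, y, z):
    """Chart map S: a point of the unit sphere (south pole excluded) to R^2."""
    return (2 * x / (z + 1), 2 * y / (z + 1))

def stereographic_inverse(X, Y):
    """Stated inverse of S: chart coordinates (X, Y) back to a point of the sphere."""
    d = 4 + X * X + Y * Y
    return (4 * X / d, 4 * Y / d, (4 - X * X - Y * Y) / d)

def sphere_point(theta, phi):
    """Hypothetical helper: a point of S^2 from polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def round_trip_error(theta, phi):
    """Largest coordinate discrepancy between p and S^{-1}(S(p))."""
    p = sphere_point(theta, phi)
    q = stereographic_inverse(*stereographic(*p))
    return max(abs(a - b) for a, b in zip(p, q))
```

A numerical check of this kind does not replace the exercise of proving smoothness; it only guards against algebra slips in the formulas.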
Example 7. Plane polar coordinates (r, θ) on R2 are defined so that x = r cos θ, y = r sin θ
with 0 ≤ θ < 2π. They do not provide a coordinate chart on all of R2 because they map R2 to
{(r, θ) : r ≥ 0, 0 ≤ θ < 2π} = [0, ∞) × [0, 2π), which is not an open subset of R2 . Also, angle θ is
discontinuous at any point of the positive x-axis [0, ∞) × {0}. However, plane polar coordinates do
provide a coordinate chart on the open set R2 \ {(x, 0) : x ≥ 0}. Changing the allowed range of θ
to [−π, π), we can move the problematic direction to the negative x-axis, producing another chart.
Together these charts form an atlas for R2 \ {(0, 0)}.
Nonexaminable: Partitions of unity A smooth space admits partitions of unity if the following
condition holds. For every family of open sets Uα so that M = ∪α Uα there is a subordinate partition
of unity, namely a possibly infinite collection of functions ψα ∈ F so that
• 0 ≤ ψα ≤ 1 for all α
• supp ψα ⊂ Uα for all α, where supp f denotes the support of a function f , defined as the
closure of the set of points where f is nonzero
• to every p ∈ M there is an open set O containing p that intersects at most finitely many of the
supp ψα ’s
• $\sum_\alpha \psi_\alpha = 1$ (NB there is no problem with convergence because at most finitely many terms in
the sum are nonzero at any p ∈ M.)
Nonexaminable: Relation to other definitions The more standard way of defining a manifold is
as a topological space M, covered by coordinate charts that are in the first instance only assumed to
be continuous with continuous inverses. One then defines two chart maps X and Y with overlapping
chart domains to be smoothly compatible if X ◦ Y −1 and Y ◦ X −1 are smooth in a manner similar
to our definition of compatibility above. It is required that the manifold can be covered by a set
of smoothly compatible charts; any such covering is called an atlas. Then one defines an atlas to
be maximal if it contains all charts that are smoothly compatible with its elements. A topological
space M with a maximal atlas of n-dimensional charts is called a smooth n-manifold. It is usual to
require additionally that the topology of M is Hausdorff and ‘second countable’.
At this point one can start discussing functions and curves. A function f : M → R is smooth
at p ∈ M if in some (hence any) coordinate chart X : O → Rn with p ∈ O, the function
f ◦ X −1 : X(O) → R is smooth at X(p), and one says that f is smooth if it is smooth at every point
of M; a function c : R → M is smooth at t ∈ R if there is a chart domain O containing c(t) and
with coordinate map X : O → Rn such that X ◦ c : R → Rn is smooth at t. The open sets are those
determined by the topology of M. It turns out to be a consequence of second countability and the
existence of a maximal atlas that M admits (smooth) partitions of unity.
It is a nontrivial theorem that every connected n-dimensional manifold in this sense is an n-dimensional manifold in our sense and vice versa. However the advantage of our approach is that
the interesting objects – smooth functions and curves – are placed at the centre of things, and the
various conditions stated are fairly intuitive. Moreover, the compatibility of overlapping charts is
automatic and one avoids needing to talk about maximal atlases. It also avoids having to know
about topological spaces, including the nontrivial definition of second countability, from the start.
For those who want to read more, most books on GR adopt the standard approach to defining
manifolds with more or less rigour. Ludvigsen’s book uses smooth spaces. Our definition of a
smooth space corresponds to what is known as a Frölicher space that is additionally balanced,
Hausdorff and smoothly path connected. However the literature on Frölicher spaces is somewhat
technical!
3 Tensors on smooth manifolds
General relativity is expressed in terms of tensor fields on smooth manifolds. The use of tensors is
motivated by the idea of ‘general covariance’ – that it should be possible to write the laws of physics
in any coordinate system in ‘the same way’. As we will see, everything can be built up from the
smooth functions and smooth curves on a smooth n-manifold M. This requires a new viewpoint on
what a vector is.
3.1 Vectors
If f : M → R is a smooth function and c : R → M is a smooth curve then f ◦ c is a smooth
function on R, which can be differentiated. For example, if c(t) is the position of an observer at time
t according to her watch, and f (p) is the temperature at point p ∈ M, then ( f ◦ c)(t) is the observed
temperature at time t, and the derivative is the rate at which the observed temperature varies. At
any specific point p = c(t) along c there is now a map from functions to their derivative along c at
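p. Let us write this map as $v : \mathcal{F} \to \mathbb{R}$,
$$v(f) = (f \circ c)'(t).$$
Its main properties are linearity:
$$v(\lambda f + \mu g) = \lambda v(f) + \mu v(g), \qquad \lambda, \mu \in \mathbb{R},\ f, g \in \mathcal{F}$$
and a Leibniz rule:
$$\begin{aligned}
v(fg) &= ((fg) \circ c)'(t) = \left.\frac{d}{ds} f(c(s))\, g(c(s))\right|_{s=t} \\
&= f(c(t)) \left.\frac{d}{ds} g(c(s))\right|_{s=t} + g(c(t)) \left.\frac{d}{ds} f(c(s))\right|_{s=t} \\
&= f(p)\,(g \circ c)'(t) + g(p)\,(f \circ c)'(t) \\
&= f(p)\,v(g) + g(p)\,v(f)
\end{aligned}$$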
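for $f, g \in \mathcal{F}$. The map $v$ is called the tangent vector to $c$ at $p$ by analogy with the situation for a
curve in $\mathbb{R}^3$, written in vector notation as $\mathbf{c}(t)$, where
$$v(f) = \left.\frac{d}{ds} f(\mathbf{c}(s))\right|_{s=t} = \dot{\mathbf{c}}(t) \cdot \nabla f|_{\mathbf{c}(t)}$$
for $f \in C^\infty(\mathbb{R}^3, \mathbb{R})$ and the connection between $v$ and $\dot{\mathbf{c}}(t)$ is clear.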
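The view of a tangent vector as a map from functions to numbers can be tried out numerically. The sketch below makes several assumptions that are not in the notes: the helix `curve`, the sample functions `f` and `g`, and the use of a central difference to approximate the derivative along the curve. It then measures how well linearity and the Leibniz rule hold.

```python
import math

def curve(t):
    """Illustrative smooth curve in R^3 (a helix); an assumption for this sketch."""
    return (math.cos(t), math.sin(t), t)

def tangent_vector(c, t, h=1e-5):
    """Return the map v(f) = (f o c)'(t), approximated by a central difference."""
    def v(f):
        return (f(c(t + h)) - f(c(t - h))) / (2 * h)
    return v

# Sample smooth functions on R^3 (illustrative choices)
f = lambda p: p[0] * p[1] + p[2]
g = lambda p: math.exp(p[0]) * p[2]
fg = lambda p: f(p) * g(p)

t0 = 0.4
v = tangent_vector(curve, t0)
p = curve(t0)

# Leibniz rule: v(fg) should equal f(p) v(g) + g(p) v(f)
leibniz_gap = abs(v(fg) - (f(p) * v(g) + g(p) * v(f)))
# Linearity: v(2f - 3g) should equal 2 v(f) - 3 v(g)
linear_gap = abs(v(lambda q: 2 * f(q) - 3 * g(q)) - (2 * v(f) - 3 * v(g)))
# Gradient formula: c'(t) . grad f = -sin^2 t + cos^2 t + 1 for this f and curve
gradient_gap = abs(v(f) - (math.cos(2 * t0) + 1))
```

All three gaps are limited only by the finite-difference approximation, so they should be very small rather than exactly zero.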
We now use the properties of tangent vectors to inspire the general definition of a vector at a
point: it is something that ‘eats’ a function and gives back a number, with the same properties as
tangent vectors. A vector field supplies a vector at every point of M, so it ‘eats’ a function and gives
back another function.
Definition 8. A vector at p ∈ M is a linear map v : F → R obeying a Leibniz rule
v( f g) = f (p)v(g) + g(p)v( f ).
(The clue that v ‘lives’ at p is that the coefficients f (p) and g(p) are evaluated there.) A vector field
on M is a linear map v : F → F (thus v( f ) ∈ F for each f ∈ F ) so that
$$v(fg) = f\,v(g) + g\,v(f), \quad\text{i.e.,}\quad v(fg)(p) = f(p)\,v(g)(p) + g(p)\,v(f)(p) \quad \forall p \in M.$$
Note that a vector at $p$ cannot be a vector at $q \neq p$: it satisfies a different Leibniz rule.
Example 9. The Leibniz rule has some useful consequences. For example, let $f \equiv 1$. Then $f^2 = f$
and by Leibniz
$$v(f) = v(f^2) = f(p)v(f) + f(p)v(f) = 2v(f),$$
so $v(1) = 0$ and hence, by linearity, $v(f) = 0$ for all constant functions $f$. Also note that if $f$ and $g$ both vanish at $p$,
then $v(fg) = f(p)v(g) + g(p)v(f) = 0$.
Now suppose that $X^1, \dots, X^n$ are coordinates with chart domain $O$. We can use the coordinates
to produce $n$ vector fields on $O$, denoted $\partial^X_\alpha$. To do this, observe that any smooth function $f$ can be
written as
$$f(p) = F(X^1(p), \dots, X^n(p)) = (F \circ X)(p)$$
for some smooth function $F$ on the chart image of $O$. More symbolically, $F = f \circ X^{-1} \in C^\infty(X(O), \mathbb{R})$. Then we define
$$\partial^X_\alpha(f)(p) = \left.\frac{\partial F}{\partial x^\alpha}\right|_{X(p)}.$$
It is an exercise to check that each $\partial^X_\alpha$ is a vector field according to the definition, and that
$$\partial^X_\alpha(X^\beta) = \delta^\beta_\alpha := \begin{cases} 1 & \beta = \alpha \\ 0 & \text{otherwise.} \end{cases}$$
(Hint: $X^\beta \circ X^{-1}(x^1, \dots, x^n) = x^\beta$.)
The main facts about vectors are given by:
Theorem 10. On an n-dimensional manifold M:
(a) The set of vectors at any q ∈ M is a real vector space, denoted Vq . In particular, if v, w ∈ Vq
and a, b ∈ R, then z = av + bw ∈ Vq is defined by
z( f ) = av( f ) + bw( f )
for all f ∈ F .
(b) If $v \in V_q$ and $X^1, \dots, X^n$ are coordinates on a chart domain containing $q$, then
$$v(f) = v^\alpha\, \partial^X_\alpha(f)(q) \quad\text{where}\quad v^\alpha = v(X^\alpha).$$
(Here, we sum the repeated index $\alpha$ from 1 to $n$.) Equivalently, $v = v^\alpha \partial^X_\alpha|_q$, where the vectors
$\partial^X_\alpha|_q \in V_q$, defined by $\partial^X_\alpha|_q(f) = \partial^X_\alpha(f)(q)$, form a basis for $V_q$, which therefore has dimension $n$.
Proof. (Nonexaminable – included for those with background in linear algebra and analysis.)
(a) Defining $z$ as above, it is easily seen that $z : \mathcal{F} \to \mathbb{R}$ is linear. Moreover we can easily check
the Leibniz rule
$$z(fg) = a\,v(fg) + b\,w(fg) = a\big(f(q)v(g) + g(q)v(f)\big) + b\big(f(q)w(g) + g(q)w(f)\big) = f(q)z(g) + g(q)z(f)$$
so that the set of vectors is indeed a real vector space.
(b) We use Taylor’s theorem with remainder, which tells us that for any fixed $(y^1, \dots, y^n)$ in the
chart image
$$F(x^1, \dots, x^n) = F(y^1, \dots, y^n) + (x^\alpha - y^\alpha) \left.\frac{\partial F}{\partial x^\alpha}\right|_{(y^1, \dots, y^n)} + (x^\alpha - y^\alpha)(x^\beta - y^\beta)\, R_{\alpha\beta}(x^1, \dots, x^n)$$
where the $n \times n$ functions $R_{\alpha\beta}$ (which depend on the fixed point $y = (y^1, \dots, y^n)$) are smooth and
we sum the repeated indices from 1 to $n$. Thus, for fixed $q \in O$ but varying $p \in O$,
$$f(p) = f(q) + (X^\alpha(p) - X^\alpha(q))\,\partial^X_\alpha(f)(q) + (X^\alpha(p) - X^\alpha(q))(X^\beta(p) - X^\beta(q))\,R_{\alpha\beta}(X(p)).$$
If $v$ is any vector at $q$, we may now use linearity and Example 9 to compute $v(f)$. Note that $f(q)$
and $X^\alpha(q)\,\partial^X_\alpha(f)(q)$ are constant w.r.t. $p$ and therefore killed by $v$. The last term is a sum of products
of functions that vanish where $p = q$ and is therefore also killed by $v$. Therefore,
$$v(f) = v\big(X^\alpha\, \partial^X_\alpha(f)(q)\big) = \partial^X_\alpha(f)(q)\, v(X^\alpha)$$
as required. We see that $v = v^\alpha \partial^X_\alpha|_q$, so the $n$ vectors $\partial^X_\alpha|_q$ span the space of vectors at $q$. They are
also linearly independent (exercise) and so form a basis for $V_q$.
We call the $n$ numbers $v^\alpha = v(X^\alpha)$ the components of $v$ in the coordinates $X^\alpha$. The components
of a vector field determine $n$ smooth functions on the chart domain of any coordinate system.
Example 11. Suppose $v$ is the tangent vector to curve $c$ at $p = c(t)$. Then the components of $v$ in
the coordinates $X^\alpha$ are
$$v^\alpha = v(X^\alpha) = (X^\alpha \circ c)'(t) = \left.\frac{d}{ds} X^\alpha(c(s))\right|_{s=t}.$$
That is, the components of the tangent vector at $c(t)$ are the derivatives of the coordinates of the
point $c(s)$ on the curve at $s = t$.
Coordinates provide a way of translating abstract vectors into concrete sets of numerical components. What those numbers are depends on the coordinate system chosen. We can see how they
change from one coordinate system to another: let $Y^\alpha$ be an alternative coordinate system with
corresponding vector fields $\partial^Y_\alpha$. Any $f \in \mathcal{F}$ can be expressed in the $Y$ coordinate system as
$$f(p) = G(Y^1(p), \dots, Y^n(p)) = F(\varphi(Y^1(p), \dots, Y^n(p)))$$
where $\varphi = X \circ Y^{-1}$ is a smooth map between the two chart images that translates between the two
sets of coordinates so that
$$X(p) = \varphi(Y(p)).$$
It is convenient to write
$$\varphi(y^1, \dots, y^n) = (x^1(y^1, \dots, y^n), \dots, x^n(y^1, \dots, y^n)).$$
As $G = F \circ \varphi$, we may use the chain rule to give
$$(\partial^Y_\alpha f)(p) = \left.\frac{\partial G}{\partial y^\alpha}\right|_{Y(p)} = \left.\frac{\partial F}{\partial x^\beta}\right|_{\varphi(Y(p))} \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} (\partial^X_\beta f)(p),$$
using $\varphi(Y(p)) = X(p)$, and hence
$$\partial^Y_\alpha|_p = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \partial^X_\beta|_p.$$
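Thus if $v$ has components $\overset{Y}{v}{}^\alpha = v(Y^\alpha)$ in the $Y$ coordinates and $\overset{X}{v}{}^\alpha$ in the $X$ coordinates, we have the
vector transformation rule
$$\overset{X}{v}{}^\beta = v(X^\beta) = \overset{Y}{v}{}^\alpha\, \partial^Y_\alpha|_p(X^\beta) = \overset{Y}{v}{}^\alpha \left.\frac{\partial x^\gamma}{\partial y^\alpha}\right|_{Y(p)} \underbrace{\partial^X_\gamma|_p(X^\beta)}_{=\delta^\beta_\gamma} = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \overset{Y}{v}{}^\alpha,$$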
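also expressed by saying that vector components transform contravariantly. Here, $\partial x^\beta/\partial y^\alpha$ is the
Jacobian matrix for the transformation from $Y$ to $X$ coordinates. (NB In these expressions we are
not summing on $X$ or $Y$ – they are just there to tell us which coordinate system is in play.) The
inverse relationship is
$$\overset{Y}{v}{}^\beta = \left.\frac{\partial y^\beta}{\partial x^\alpha}\right|_{X(p)} \overset{X}{v}{}^\alpha,$$
because the matrices of partial derivatives are inverse to one another by the chain rule:
$$\left.\frac{\partial x^\gamma}{\partial y^\beta}\right|_{Y(p)} \left.\frac{\partial y^\beta}{\partial x^\alpha}\right|_{X(p)} = \left.\frac{\partial x^\gamma}{\partial x^\alpha}\right|_{X(p)} = \delta^\gamma_\alpha.$$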
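As a rough numerical illustration of the transformation rule, the sketch below takes Cartesian coordinates as the $X$ system and plane polar coordinates as the $Y$ system. This choice, the straight-line `curve`, and all function names are assumptions made for the example. The components of one tangent vector are computed directly in each system as in Example 11, and the polar components are compared with those obtained by applying the Jacobian $\partial y^\beta/\partial x^\alpha$ to the Cartesian ones.

```python
import math

def polar(x, y):
    """Y-coordinates (r, theta); valid away from the negative x-axis."""
    return (math.hypot(x, y), math.atan2(y, x))

def cartesian(x, y):
    """X-coordinates: the identity chart on R^2."""
    return (x, y)

def curve(s):
    """Illustrative curve through p = (1, 1) at s = 0, given in Cartesian form."""
    return (1 + s, 1 + 2 * s)

def components(coords, c, t, h=1e-6):
    """v^alpha = d/ds coords^alpha(c(s)) at s = t, by central differences."""
    plus, minus = coords(*c(t + h)), coords(*c(t - h))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

v_X = components(cartesian, curve, 0.0)  # approximately (1, 2)
v_Y = components(polar, curve, 0.0)      # polar components of the same vector

# Jacobian dy^beta/dx^alpha of (r, theta) with respect to (x, y) at p
x, y = curve(0.0)
r = math.hypot(x, y)
jac = ((x / r, y / r),
       (-y / r**2, x / r**2))

# Transformation rule: v_Y^beta = (dy^beta/dx^alpha) v_X^alpha
v_Y_from_rule = tuple(sum(jac[b][a] * v_X[a] for a in range(2)) for b in range(2))
```

The two routes to the polar components agree up to finite-difference error, which is the content of the contravariant rule at this point.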
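Example 12. On a suitable open subset of $\mathbb{R}^2$, consider Cartesian coordinates $X^1(x, y) = x$,
$X^2(x, y) = y$ and polar coordinates $P^1(x, y) = \sqrt{x^2 + y^2}$, $P^2(x, y) = \arctan(y/x)$. Then
$$x^1(p^1, p^2) = p^1 \cos p^2, \qquad x^2(p^1, p^2) = p^1 \sin p^2$$
so
$$\partial^P_1 = \left.\frac{\partial x^1}{\partial p^1}\right|_{P(x,y)} \partial^X_1 + \left.\frac{\partial x^2}{\partial p^1}\right|_{P(x,y)} \partial^X_2 = \cos P^2\, \partial^X_1 + \sin P^2\, \partial^X_2 = \frac{1}{\sqrt{x^2 + y^2}}\left(x\,\partial^X_1 + y\,\partial^X_2\right)$$
and similarly
$$\partial^P_2 = -P^1 \sin P^2\, \partial^X_1 + P^1 \cos P^2\, \partial^X_2 = -y\,\partial^X_1 + x\,\partial^X_2.$$
It is usual to adopt a less formal notation and write these expressions as
$$\partial_r = \cos\theta\, \partial_x + \sin\theta\, \partial_y, \qquad \partial_\theta = -r \sin\theta\, \partial_x + r \cos\theta\, \partial_y = -y\,\partial_x + x\,\partial_y.$$
Thus if $f(x, y) = x^2 y$, for example,
$$\partial_\theta(f) = -y\,\partial_x(f) + x\,\partial_y(f) = -2xy^2 + x^3 = r^3 \cos\theta\,(\cos^2\theta - 2\sin^2\theta) = r^3 \cos\theta\,(1 - 3\sin^2\theta),$$
which can also be seen by noting that $x^2 y = r^3 \cos^2\theta \sin\theta$ and differentiating w.r.t. $\theta$.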
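The last computation in Example 12 can be confirmed numerically. In this sketch the sample point $(r, \theta) = (1.5, 0.8)$ and the helper `d_theta` are arbitrary illustrative choices; the derivative of $f = x^2 y$ along $\theta$ at fixed $r$ is approximated by a central difference and compared with both closed forms.

```python
import math

def f(x, y):
    """The function f(x, y) = x^2 y from Example 12."""
    return x * x * y

def d_theta(func, r, theta, h=1e-6):
    """Derivative of func along the theta coordinate, holding r fixed."""
    at = lambda th: func(r * math.cos(th), r * math.sin(th))
    return (at(theta + h) - at(theta - h)) / (2 * h)

r, theta = 1.5, 0.8
x, y = r * math.cos(theta), r * math.sin(theta)

numeric = d_theta(f, r, theta)
cartesian_form = -2 * x * y**2 + x**3                                 # -y f_x + x f_y
polar_form = r**3 * math.cos(theta) * (1 - 3 * math.sin(theta)**2)
```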
An alternative definition of a vector is that it is something that defines a set of components in
any coordinate system, and that these components in different coordinate systems are connected by
the vector transformation rule. Our more abstract approach has the virtue of explaining what the
‘something’ is, how it defines components in each coordinate system, and deriving the transformation
rule.
Finally, let us note that every vector $v \in V_q$ is a tangent vector to a suitable curve, so the abstract
definition has a concrete interpretation. Choose a coordinate map $X$ so that $X(q) = 0 \in \mathbb{R}^n$ and
write $v = v^\alpha \partial^X_\alpha|_q$. For simplicity suppose that the chart image is the whole of $\mathbb{R}^n$, consider the curve
$c(t) = X^{-1}(v^1 t, v^2 t, \dots, v^n t)$ and observe that
$$f(c(t)) = (f \circ X^{-1})(v^1 t, v^2 t, \dots, v^n t) = F(v^1 t, v^2 t, \dots, v^n t)$$
so
$$\left.\frac{d}{dt} f(c(t))\right|_{t=0} = v^\alpha \left.\frac{\partial F}{\partial x^\alpha}\right|_{X(q)} = v^\alpha\, \partial^X_\alpha|_q(f) = v(f).$$
(If the chart image is not all of $\mathbb{R}^n$ but some other open set containing the origin, there is a trick: use
$c(t) = X^{-1}(v^1 \eta(t), \dots, v^n \eta(t))$ where $\eta(t) = \epsilon \tan^{-1}(t/\epsilon)$ for small enough $\epsilon > 0$. You might amuse
yourself by seeing how and why this works.)
An application of this observation is that vectors obey a chain rule: if $m \in C^\infty(\mathbb{R})$ and $f \in \mathcal{F}$, then
$$v(m \circ f) = m'(f(p))\, v(f)$$
holds for all $v \in V_p$. (See Problems 1.)
3.2 Covectors
Definition 13. A covector at p is a linear map from Vp to R (i.e., ω : Vp → R obeying ω(av + bw) =
aω(v) + bω(w) for a, b ∈ R, v, w ∈ Vp ). The covectors at p form a vector space Vp∗ , called the dual
space of Vp ; if ω, η ∈ Vp∗ and a, b ∈ R, then aω + bη ∈ Vp∗ is defined by
(aω + bη)(v) = aω(v) + bη(v).
A covector field is a linear map ω from the space of vector fields to F with the property that
ω( f v) = f ω(v)
for all vector fields v and all f ∈ F .
Example 14. Every $f \in \mathcal{F}$ determines a covector $df|_p$ at any point $p$ by
$$(df|_p)(v) = v(f) \qquad \forall v \in V_p$$
and a covector field $df$, called the differential of $f$, by $(df)(v) = v(f)$ for any vector field $v$.
Exercise: Check linearity. Note that
$$(df)(gv) = (gv)(f) = g\,v(f) = g\,(df)(v)$$
for all vector fields $v$ and $g \in \mathcal{F}$. The differential $d$ is a linear map from $\mathcal{F}$ to the space of covector
fields, and obeys Leibniz and chain rules:
$$d(fg) = f\,dg + g\,df, \qquad d(m \circ f) = (m' \circ f)\,df,$$
reducing to
$$d(fg)|_p = f(p)\,dg|_p + g(p)\,df|_p, \qquad d(m \circ f)|_p = m'(f(p))\,df|_p$$
at any $p$, which follow from the analogous rules for vectors (exercise). Armed with the product and
chain rules we can compute $df$ for most functions fairly easily. For example, if $f = \cos\theta \sin\varphi$,
$$df = \sin\varphi\, d(\cos\theta) + \cos\theta\, d(\sin\varphi) = -\sin\varphi \sin\theta\, d\theta + \cos\theta \cos\varphi\, d\varphi. \tag{$\star$}$$
Any coordinate system $X^\alpha$ determines a basis of covectors $dX^\alpha|_p$ at each $p$ in its chart domain,
so $V_p^*$ is $n$-dimensional, like $V_p$. To see this, note that
$$dX^\alpha|_p(v) = v(X^\alpha) = v^\alpha,$$
and hence, if $\omega$ is any covector at $p$,
$$\omega(v) = \omega(v^\alpha \partial^X_\alpha|_p) = v^\alpha\, \omega(\partial^X_\alpha|_p) = \omega(\partial^X_\alpha|_p)\,(dX^\alpha|_p)(v).$$
Since this holds for all $v$, we have
$$\omega = \omega_\alpha\, dX^\alpha|_p, \quad\text{where}\quad \omega_\alpha = \omega(\partial^X_\alpha|_p)$$
are the components of $\omega$ in this coordinate system. Note that the above calculation also shows that
$$\omega(v) = \omega_\alpha v^\alpha.$$
For example, in local $\theta, \varphi$ coordinates, the covector field $df$ from ($\star$) in Example 14 has components $(df)_\theta =
-\sin\varphi \sin\theta$, while $(df)_\varphi = \cos\theta \cos\varphi$.
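Since $(df)_\alpha = df(\partial_\alpha) = \partial_\alpha(f)$, the components quoted above are just the partial derivatives of $f = \cos\theta \sin\varphi$ in the $(\theta, \varphi)$ chart. The sketch below checks them by central differences and evaluates the pairing $\omega(v) = \omega_\alpha v^\alpha$; the sample point and the vector components `v` are arbitrary assumptions.

```python
import math

def f(theta, phi):
    """The function f = cos(theta) sin(phi) used in the example."""
    return math.cos(theta) * math.sin(phi)

def differential_components(func, theta, phi, h=1e-6):
    """Components (df)_theta, (df)_phi, approximated by central differences."""
    d_theta = (func(theta + h, phi) - func(theta - h, phi)) / (2 * h)
    d_phi = (func(theta, phi + h) - func(theta, phi - h)) / (2 * h)
    return d_theta, d_phi

theta, phi = 0.6, 1.1
df_theta, df_phi = differential_components(f, theta, phi)

# Closed forms from the worked example
exact_theta = -math.sin(phi) * math.sin(theta)
exact_phi = math.cos(theta) * math.cos(phi)

# Pairing df(v) = (df)_alpha v^alpha for illustrative components v = (v^theta, v^phi)
v = (0.3, -2.0)
pairing = df_theta * v[0] + df_phi * v[1]
```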
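The general transformation rule for covector components is easily deduced. Let $Y$ be another
coordinate system so that
$$\partial^Y_\alpha|_p = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \partial^X_\beta|_p.$$
Then the components of $\omega$ transform covariantly:
$$\overset{Y}{\omega}_\alpha = \omega(\partial^Y_\alpha|_p) = \omega\!\left(\left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \partial^X_\beta|_p\right) = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \omega(\partial^X_\beta|_p) = \left.\frac{\partial x^\beta}{\partial y^\alpha}\right|_{Y(p)} \overset{X}{\omega}_\beta.$$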
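Because vector components transform with $\partial y/\partial x$ and covector components with $\partial x/\partial y$, the number $\omega(v) = \omega_\alpha v^\alpha$ should come out the same in every chart. The sketch below checks this at one point, under illustrative assumptions: Cartesian coordinates as $X$, polar coordinates as $Y$, the covector $df$ for $f = x^2 y$, and a vector with Cartesian components $(1, 2)$.

```python
import math

# Sample point p, in Cartesian (X) and polar (Y) coordinates
x, y = 1.0, 1.0
r, theta = math.hypot(x, y), math.atan2(y, x)

# dx_dy[b][a] = dx^b/dy^a for x = r cos(theta), y = r sin(theta)
dx_dy = ((math.cos(theta), -r * math.sin(theta)),
         (math.sin(theta), r * math.cos(theta)))
# dy_dx[b][a] = dy^b/dx^a, the inverse Jacobian
dy_dx = ((x / r, y / r),
         (-y / r**2, x / r**2))

omega_X = (2 * x * y, x * x)  # Cartesian components of df for f = x^2 y
v_X = (1.0, 2.0)              # Cartesian components of an illustrative vector

# Covariant rule: omega_Y_a = (dx^b/dy^a) omega_X_b
omega_Y = tuple(sum(dx_dy[b][a] * omega_X[b] for b in range(2)) for a in range(2))
# Contravariant rule: v_Y^b = (dy^b/dx^a) v_X^a
v_Y = tuple(sum(dy_dx[b][a] * v_X[a] for a in range(2)) for b in range(2))

pairing_X = sum(w * c for w, c in zip(omega_X, v_X))
pairing_Y = sum(w * c for w, c in zip(omega_Y, v_Y))
```

The two Jacobian factors cancel in the contraction, which is why the pairing needs no reference to a coordinate system.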
If ω is a covector field then $\omega_\alpha = \omega(\partial^X_\alpha)$ defines smooth component functions on the chart
domain of X. If v is any vector field we now have (on the chart domain)
\[ \omega(v) = \omega(v^\alpha \partial^X_\alpha) = v^\alpha\, \omega(\partial^X_\alpha) = v^\alpha \omega_\alpha \]
where we used the fact that $v^\alpha \partial^X_\alpha$ is a sum of smooth multiples of vector fields and the property
$\omega(fv) = f\omega(v)$ of a covector field to pull the component functions $v^\alpha$ outside. We can also use the
components of ω to define a covector at each point p in the chart domain by
\[ \omega|_p = \omega_\alpha(p)\, dX^\alpha|_p . \]
3.3 Tensors
Covectors at p ‘eat’ vectors at p: they are linear maps from $V_p$ to $\mathbb{R}$. Tensors ‘eat’ both vectors and
covectors in a ‘multilinear’ way. Here, if $W_1, W_2, \ldots, W_k$ is a collection of vector spaces, a function
\[ T : W_1 \times W_2 \times \cdots \times W_k \to \mathbb{R} \]
is a multilinear map if it is linear in each slot separately:
\[ T(w_1, \ldots, w_{j-1}, \lambda w_j + \mu z_j, w_{j+1}, \ldots, w_k) = \lambda T(w_1, \ldots, w_{j-1}, w_j, w_{j+1}, \ldots, w_k) + \mu T(w_1, \ldots, w_{j-1}, z_j, w_{j+1}, \ldots, w_k). \]
Definition 15. A tensor of valence $\binom{r}{s}$ (or $\binom{r}{s}$-tensor) at p is a multilinear map
\[ T : \underbrace{V_p^* \times \cdots \times V_p^*}_{r \text{ times}} \times \underbrace{V_p \times \cdots \times V_p}_{s \text{ times}} \to \mathbb{R}. \]
Example 16. Covectors are tensors of valence $\binom{0}{1}$ and vice versa. Similarly, a vector v at p defines
a tensor of valence $\binom{1}{0}$ by the linear map $T(\omega) = \omega(v)$, and vice versa: if T is a $\binom{1}{0}$-tensor at p then
we define
\[ v(f) = T(df|_p) \]
and check that the linearity and Leibniz rules hold (exercise; see footnote 3). A tensor of valence $\binom{0}{0}$ is a real
number, i.e., a scalar. Looking ahead, the metric on a manifold is a $\binom{0}{2}$-tensor. Finally, the Kronecker
δ is a $\binom{1}{1}$-tensor
\[ \delta : V_p^* \times V_p \to \mathbb{R}, \qquad \delta(\omega, v) = \omega(v). \]
Footnote 3: This is an instance of a theorem from linear algebra that any finite dimensional vector space is isomorphic to the
dual of its dual in a natural way, so $V_p$ corresponds naturally to the linear maps from $V_p^*$ to $\mathbb{R}$.
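Given any coordinates X near p, a tensor T of valence $\binom{r}{s}$ has X-components
\[ T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s} = T(dX^{\alpha_1}, \ldots, dX^{\alpha_r}, \partial^X_{\beta_1}, \ldots, \partial^X_{\beta_s}) \]
in which it is understood that all the $dX^\alpha$'s are really $dX^\alpha|_p$'s etc. By multilinearity one has
\[ T(\overset{1}{\omega}, \ldots, \overset{r}{\omega}, \underset{1}{v}, \ldots, \underset{s}{v}) = T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s}\, \overset{1}{\omega}_{\alpha_1} \cdots \overset{r}{\omega}_{\alpha_r}\, \underset{1}{v}^{\beta_1} \cdots \underset{s}{v}^{\beta_s} \]
(where all components are given in the same coordinate system). The transformation rule is
\[ \overset{X}{T}{}^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s} = \frac{\partial x^{\alpha_1}}{\partial y^{\gamma_1}} \cdots \frac{\partial x^{\alpha_r}}{\partial y^{\gamma_r}}\, \frac{\partial y^{\delta_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{\delta_s}}{\partial x^{\beta_s}}\, \overset{Y}{T}{}^{\gamma_1 \cdots \gamma_r}{}_{\delta_1 \cdots \delta_s} \]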
so the subscript indices transform covariantly and the superscript indices transform contravariantly.
Many older books define tensors as arrays of components transforming according to this rule.
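Example 17. In any coordinates $X^\alpha$ the components of δ are
\[ \delta^\alpha{}_\beta = \delta(dX^\alpha, \partial^X_\beta) = dX^\alpha(\partial^X_\beta) = \begin{cases} 1 & \alpha = \beta \\ 0 & \text{otherwise.} \end{cases} \]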
Note that the numerical components are the same in all coordinate systems.
We extend the tensor notation slightly. Suppose S is a multilinear map on r copies of Vp∗ and s
copies of Vp , but not necessarily in the order where all the Vp∗ ’s come first. For example, suppose
S : Vp × Vp∗ × Vp → R
Then there is a bona fide tensor that is defined by bringing all the Vp∗ ’s to the front but without
changing order within the vector or covector arguments. In our example, this tensor is
Ŝ : Vp∗ × Vp × Vp → R,
Ŝ(ω, u, v) = S(u, ω, v).
It is very convenient to write the components $\hat{S}^\alpha{}_{\beta\gamma}$ of $\hat{S}$ as
\[ \hat{S}^\alpha{}_{\beta\gamma} = S_\beta{}^\alpha{}_\gamma \]
which allows us to keep track of the original ‘pattern’ of arguments in S and avoids needing to write
Ŝ. For this reason we enforce a rule that indices are not written above each other: there is only one
index in any vertical slot. This is particularly useful when we start raising and lowering indices.
A tensor field is defined in a similar way to a tensor, except that we require it to be a multilinear
map from spaces of covector fields and vector fields to F, with the property that multiplication of
any of these (co)vector fields by a smooth function can be pulled outside, e.g.,
\[ S(f\omega, \eta, v) = f S(\omega, \eta, v) = S(\omega, f\eta, v) = S(\omega, \eta, fv) \qquad \forall f \in F. \]
As with covector fields the components of a tensor field define smooth functions, and the condition
on pulling out smooth functions just given ensures that
\[ T(\overset{1}{\omega}, \ldots, \overset{r}{\omega}, \underset{1}{v}, \ldots, \underset{s}{v}) = T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s}\, \overset{1}{\omega}_{\alpha_1} \cdots \overset{r}{\omega}_{\alpha_r}\, \underset{1}{v}^{\beta_1} \cdots \underset{s}{v}^{\beta_s} \]
holds in any coordinate system, generalising the same result for covector fields.
3.4 Algebraic operations with tensors
All tensors in this subsection (except in Lemma 20) are evaluated at a given point p.
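Linear combinations If S and T are two $\binom{r}{s}$-tensors and a, b ∈ R, then the linear combination
aS + bT is defined by
\[ (aS + bT)(\omega_1, \ldots, \omega_r, v_1, \ldots, v_s) = a S(\omega_1, \ldots, \omega_r, v_1, \ldots, v_s) + b T(\omega_1, \ldots, \omega_r, v_1, \ldots, v_s), \]
or, in components,
\[ (aS + bT)^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s} = a S^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s} + b T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s} . \]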
NB: each term in the above expressions has the same sets of contravariant and covariant indices.
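Tensor products Let T be a $\binom{r}{s}$-tensor and S be a $\binom{r'}{s'}$-tensor. The tensor product T ⊗ S is the
$\binom{r+r'}{s+s'}$-tensor corresponding to the multilinear map
\[ T \otimes S : \underbrace{V_p^* \times \cdots \times V_p^*}_{r \text{ times}} \times \underbrace{V_p \times \cdots \times V_p}_{s \text{ times}} \times \underbrace{V_p^* \times \cdots \times V_p^*}_{r' \text{ times}} \times \underbrace{V_p \times \cdots \times V_p}_{s' \text{ times}} \to \mathbb{R} \]
given by the product of the two maps T and S (writing ω, η for covector arguments and u, v for vector arguments),
\[ (T \otimes S)(\omega_1, \ldots, \omega_r, u_1, \ldots, u_s, \eta_1, \ldots, \eta_{r'}, v_1, \ldots, v_{s'}) = T(\omega_1, \ldots, \omega_r, u_1, \ldots, u_s)\, S(\eta_1, \ldots, \eta_{r'}, v_1, \ldots, v_{s'}) \]
or, in components,
\[ (T \otimes S)^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s}{}^{\gamma_1 \cdots \gamma_{r'}}{}_{\delta_1 \cdots \delta_{s'}} = T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s}\, S^{\gamma_1 \cdots \gamma_{r'}}{}_{\delta_1 \cdots \delta_{s'}} . \]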
Note that S ⊗ T and T ⊗ S are different tensors in general.
Every $\binom{r}{s}$-tensor is a finite linear combination of tensor products of r vectors and s covectors:
suppose $X^1, \ldots, X^n$ are coordinates near p. Then T can be reconstructed from its X-components by
\[ T = T^{\alpha_1 \cdots \alpha_r}{}_{\beta_1 \cdots \beta_s}\, \partial^X_{\alpha_1} \otimes \cdots \otimes \partial^X_{\alpha_r} \otimes dX^{\beta_1} \otimes \cdots \otimes dX^{\beta_s} \]
as can be seen by checking that the two sides define the same multilinear map. This shows how a
tensor can be reconstructed from its components in any one coordinate system.
Contractions Suppose that T is a tensor of valence $\binom{r}{s}$ with r, s ≥ 1. Choose one of the
contravariant and one of the covariant indices, say 1 ≤ i ≤ r and 1 ≤ j ≤ s. Then we can contract
T on its i'th contravariant and j'th covariant indices to form a new tensor S of valence $\binom{r-1}{s-1}$. This
is most easily defined in components: if $X^1, \ldots, X^n$ are coordinates near p, then we define S by its
components in the X-coordinate system
\[ S^{\alpha_1 \cdots \alpha_{r-1}}{}_{\beta_1 \cdots \beta_{s-1}} = T^{\alpha_1 \cdots \alpha_{i-1} \gamma \alpha_i \cdots \alpha_{r-1}}{}_{\beta_1 \cdots \beta_{j-1} \gamma \beta_j \cdots \beta_{s-1}} \]
where, as usual, the repeated γ index is summed. We may demand that the components in all other
systems are obtained by the tensor transformation rule. Importantly, the contracted tensor S does
not depend on the coordinate system used to define it. This can be seen by checking the identity
\[ \overset{X}{T}{}^{\alpha_1 \cdots \alpha_{i-1} \gamma \alpha_i \cdots \alpha_{r-1}}{}_{\beta_1 \cdots \beta_{j-1} \gamma \beta_j \cdots \beta_{s-1}} = \frac{\partial x^{\alpha_1}}{\partial y^{\gamma_1}} \cdots \frac{\partial x^{\alpha_{r-1}}}{\partial y^{\gamma_{r-1}}}\, \frac{\partial y^{\delta_1}}{\partial x^{\beta_1}} \cdots \frac{\partial y^{\delta_{s-1}}}{\partial x^{\beta_{s-1}}}\, \overset{Y}{T}{}^{\gamma_1 \cdots \gamma_{i-1} \alpha \gamma_i \cdots \gamma_{r-1}}{}_{\delta_1 \cdots \delta_{j-1} \alpha \delta_j \cdots \delta_{s-1}} \]
which is left as an exercise. Here it is crucial that we contract one contravariant and one covariant
index. A contraction over two covariant indices would not generally produce a tensor. Obviously
we can use an index other than γ for the contracted index (apart from the α’s and β’s, of course) –
this is called ‘relabelling a dummy index’. We can contract several pairs of indices but must use a
different dummy index for each pairing.
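The claim that only a contravariant-covariant contraction is coordinate independent can be illustrated in two dimensions. The sketch below uses a hypothetical constant Jacobian J (with entries ∂x^α/∂y^β) and a hypothetical array M of components, both chosen arbitrarily. Read as a $\binom{1}{1}$-tensor, M has Y-components J⁻¹MJ and its contraction (the trace) is unchanged. Read as a $\binom{0}{2}$-tensor, M has Y-components JᵀMJ and the sum of its diagonal entries changes.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical Jacobian J[a][b] = dx^a/dy^b and hypothetical X-components M.
J = [[F(2), F(1)], [F(0), F(3)]]
M = [[F(1), F(2)], [F(3), F(4)]]

# M read as a (1,1)-tensor T^a_b: Y-components are J^{-1} M J.
mixed_Y = matmul(matmul(inverse_2x2(J), M), J)
# M read as a (0,2)-tensor T_ab: Y-components are J^T M J.
lower_Y = matmul(matmul(transpose(J), M), J)

print(trace(M), trace(mixed_Y), trace(lower_Y))
```

Exact rational arithmetic is used so that the comparison is not blurred by rounding.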
Example 18. Tensor expressions involving contractions with δ can be simplified, e.g.,
\[ \delta^\alpha{}_\beta S^{\gamma\delta}{}_\alpha = S^{\gamma\delta}{}_\beta, \qquad \delta^\alpha{}_\alpha = n. \]
The rules of index notation It becomes apparent that valid tensor expressions always obey the
following rules:
• in any term of a valid tensor expression
– an index that appears twice must appear once as a contravariant index and once as a
covariant index (and is called a ‘dummy index’)
– no index appears more than twice, and the indices that appear once are called ‘free
indices’
• every term in a valid tensor expression has the same contravariant free indices and the same
covariant free indices
• free indices can be relabelled across an equation provided that the relabelling is done in the
same way in every term; dummy indices may be relabelled separately in different terms.
Therefore, when manipulating tensor expressions these rules must be respected. Sometimes this
requires some relabelling to forestall problems.
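Example 19. Suppose $S^\alpha = T^{\beta\alpha}{}_\beta$. Then we may write $S^\beta = T^{\gamma\beta}{}_\gamma$ (but NOT $S^\beta = T^{\beta\beta}{}_\beta$). Similarly,
\[ (S \otimes S)^{\alpha\beta} = S^\alpha S^\beta = T^{\gamma\alpha}{}_\gamma\, T^{\delta\beta}{}_\delta \]
is a valid equation, but
\[ (S \otimes S)^{\alpha\beta} = T^{\beta\alpha}{}_\beta\, T^{\beta\beta}{}_\beta \qquad \text{WRONG!} \]
is not.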
Everyone makes mistakes from time to time. If you find that your supposedly tensorial expression
does not conform to the rules of index notation, simply backtrack to find the place where the index
rules first failed and fix the problem, usually by suitable relabelling.
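Many tensor fields arise in the following way.
Lemma 20. If $\mathcal{L}$ is a linear map from $\binom{r}{s}$-tensor fields to $\binom{r'}{s'}$-tensor fields with the property that
$\mathcal{L}(fS) = f\mathcal{L}(S)$ for all $\binom{r}{s}$ tensors S and all scalar fields f, then there is a $\binom{r'+s}{s'+r}$-tensor field T
such that
\[ (\mathcal{L}S)^{\alpha_1 \cdots \alpha_{r'}}{}_{\beta_1 \cdots \beta_{s'}} = T^{\alpha_1 \cdots \alpha_{r'}}{}_{\beta_1 \cdots \beta_{s'}}{}^{\mu_1 \cdots \mu_s}{}_{\nu_1 \cdots \nu_r}\, S^{\nu_1 \cdots \nu_r}{}_{\mu_1 \cdots \mu_s} . \]
Proof: For private study – too many indices to write on a board! We define a multilinear map T on
r + s′ vector and s + r′ covector fields by
\[ T(\overset{1}{\omega}, \ldots, \overset{r'}{\omega}, \underset{1}{v}, \ldots, \underset{s'}{v}, \overset{1}{\eta}, \ldots, \overset{s}{\eta}, \underset{1}{u}, \ldots, \underset{r}{u}) = \mathcal{L}(\underset{1}{u} \otimes \cdots \otimes \underset{r}{u} \otimes \overset{1}{\eta} \otimes \cdots \otimes \overset{s}{\eta})\,(\overset{1}{\omega}, \ldots, \overset{r'}{\omega}, \underset{1}{v}, \ldots, \underset{s'}{v}) \]
which has the property that the multiplication of any of its arguments by f ∈ F can be ‘pulled out’,
because of the properties of $\mathcal{L}$ (exercise).
Thus T defines a tensor field and we just need to check
the formula given. Given a general $\binom{r}{s}$-tensor field S and coordinates X, we may write
\[ (\mathcal{L}S)^{\alpha_1 \cdots \alpha_{r'}}{}_{\beta_1 \cdots \beta_{s'}} = (\mathcal{L}S)(dX^{\alpha_1}, \ldots, dX^{\alpha_{r'}}, \partial^X_{\beta_1}, \ldots, \partial^X_{\beta_{s'}}) . \]
Then inserting
\[ S = S^{\nu_1 \cdots \nu_r}{}_{\mu_1 \cdots \mu_s}\, \partial^X_{\nu_1} \otimes \cdots \otimes \partial^X_{\nu_r} \otimes dX^{\mu_1} \otimes \cdots \otimes dX^{\mu_s} \]
and ‘pulling out’ the smooth component functions,
\begin{align*}
(\mathcal{L}S)^{\alpha_1 \cdots \alpha_{r'}}{}_{\beta_1 \cdots \beta_{s'}}
&= S^{\nu_1 \cdots \nu_r}{}_{\mu_1 \cdots \mu_s}\, \bigl(\mathcal{L}(\partial^X_{\nu_1} \otimes \cdots \otimes \partial^X_{\nu_r} \otimes dX^{\mu_1} \otimes \cdots \otimes dX^{\mu_s})\bigr)(dX^{\alpha_1}, \ldots, dX^{\alpha_{r'}}, \partial^X_{\beta_1}, \ldots, \partial^X_{\beta_{s'}}) \\
&= S^{\nu_1 \cdots \nu_r}{}_{\mu_1 \cdots \mu_s}\, T(dX^{\alpha_1}, \ldots, dX^{\alpha_{r'}}, \partial^X_{\beta_1}, \ldots, \partial^X_{\beta_{s'}}, dX^{\mu_1}, \ldots, dX^{\mu_s}, \partial^X_{\nu_1}, \ldots, \partial^X_{\nu_r}) \\
&= S^{\nu_1 \cdots \nu_r}{}_{\mu_1 \cdots \mu_s}\, T^{\alpha_1 \cdots \alpha_{r'}}{}_{\beta_1 \cdots \beta_{s'}}{}^{\mu_1 \cdots \mu_s}{}_{\nu_1 \cdots \nu_r} .
\end{align*}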
4 Metric tensors
Definition 21. A metric tensor is a $\binom{0}{2}$-tensor field g that is symmetric and has an inverse metric,
i.e., a symmetric $\binom{2}{0}$ tensor $g^{-1}$ such that
\[ (g^{-1})^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha{}_\gamma \]
holds in (one and hence) all coordinate systems.
Symmetry of g and $g^{-1}$ means that
\[ g(u, v) = g(v, u), \qquad g^{-1}(\omega, \eta) = g^{-1}(\eta, \omega) \]
for all vectors u and v and covectors ω and η at any point p, or in components,
\[ g_{\alpha\beta} = g_{\beta\alpha}, \qquad (g^{-1})^{\alpha\beta} = (g^{-1})^{\beta\alpha} \]
in any coordinate system. To see these are the same, compute:
\[ g_{\alpha\beta} = g(\partial^X_\alpha, \partial^X_\beta) = g(\partial^X_\beta, \partial^X_\alpha) = g_{\beta\alpha} . \]
Written as a matrix, the components of $g^{-1}$ form the inverse matrix to those of g, because the
components of δ give the identity matrix.
Convention: When only one metric is in play it is conventional to write $(g^{-1})^{\alpha\beta}$ as $g^{\alpha\beta}$ – a useful
abuse of notation. Thus we have $g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha{}_\gamma$. Using symmetry of g and $g^{-1}$ we obtain similar
expressions, $\delta^\alpha{}_\gamma = g^{\alpha\beta} g_{\gamma\beta} = g^{\beta\alpha} g_{\gamma\beta} = g^{\beta\alpha} g_{\beta\gamma}$. The metric and inverse metric can be reconstructed
from their components, like any tensors:
\[ g = g_{\alpha\beta}\, dX^\alpha \otimes dX^\beta, \qquad g^{-1} = g^{\alpha\beta}\, \partial^X_\alpha \otimes \partial^X_\beta . \]
Example 22. The Euclidean metric e on $\mathbb{R}^3$ is defined so that
\[ e(\partial_i, \partial_j) = \delta_{ij} \]
in Cartesian coordinates $(x^1, x^2, x^3) = (x, y, z)$. Spherical polar coordinates (r, θ, ϕ) are defined on
a suitable chart domain in $\mathbb{R}^3$ so that
\[ x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta . \]
We can find the basis vector fields in spherical polar coordinates, e.g., expanding
\[ \partial_\theta = \frac{\partial x}{\partial\theta}\partial_x + \frac{\partial y}{\partial\theta}\partial_y + \frac{\partial z}{\partial\theta}\partial_z = r\cos\theta\cos\varphi\,\partial_x + r\cos\theta\sin\varphi\,\partial_y - r\sin\theta\,\partial_z . \]
The metric components are found by substituting into e, e.g.
\[ e_{\theta\theta} = e(\partial_\theta, \partial_\theta) = \left(\frac{\partial x}{\partial\theta}\right)^2 + \left(\frac{\partial y}{\partial\theta}\right)^2 + \left(\frac{\partial z}{\partial\theta}\right)^2 = r^2 \]
where we have used multilinearity of e, its Cartesian components, and simplified. Similar calculations [exercise] give the nonzero components as
\[ e_{rr} = 1, \qquad e_{\theta\theta} = r^2, \qquad e_{\varphi\varphi} = r^2 \sin^2\theta \]
and all others vanishing, i.e.,
\[ e = dr \otimes dr + r^2\, d\theta \otimes d\theta + r^2 \sin^2\theta\, d\varphi \otimes d\varphi . \]
(We find exactly the same results by using the tensor transformation rule between Cartesian
and spherical polar coordinates.) In matrix form, the components form a diagonal matrix
$\mathrm{diag}(1, r^2, r^2\sin^2\theta)$ with inverse matrix $\mathrm{diag}(1, 1/r^2, 1/(r^2\sin^2\theta))$. Therefore the nonzero components of the inverse metric are
\[ e^{rr} = 1, \qquad e^{\theta\theta} = \frac{1}{r^2}, \qquad e^{\varphi\varphi} = \frac{1}{r^2\sin^2\theta} \]
so we could write
\[ e^{-1} = \partial_r \otimes \partial_r + \frac{1}{r^2}\, \partial_\theta \otimes \partial_\theta + \frac{1}{r^2\sin^2\theta}\, \partial_\varphi \otimes \partial_\varphi . \]
A bonus from this example is that we can read off the metric of the two-sphere $S^2$ – this is defined as
the r = 1 surface in $\mathbb{R}^3$. Any curve c(t) lying in $S^2$ has tangent vectors that are linear combinations
of $\partial_\theta$ and $\partial_\varphi$, because their component along $\partial_r$ vanishes:
\[ \dot{c}^r = dr(\dot{c}) = \dot{c}(r) = \frac{d}{dt}\bigl(r(c(t))\bigr) = \frac{d}{dt} 1 = 0 . \]
Therefore the metric on $S^2$ is obtained from e expressed in polar coordinates by setting r = 1 and
restricting to the metric components that do not involve dr (we can think of this as setting dr = 0)
to give the $S^2$ metric
\[ g = d\theta \otimes d\theta + \sin^2\theta\, d\varphi \otimes d\varphi . \]
We will describe another method for finding this metric shortly.
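The component calculation of Example 22 can be checked numerically. The sketch below writes out the Cartesian components of the three spherical polar basis vector fields (the ∂θ row is the one expanded in the example; the other two rows are obtained in the same way) and forms their Euclidean dot products. The function names and the sample point are illustrative assumptions.

```python
import math

def jacobian(r, theta, phi):
    """Rows are the Cartesian components of d_r, d_theta and d_phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, st * sp, ct],                # d_r
        [r * ct * cp, r * ct * sp, -r * st],   # d_theta
        [-r * st * sp, r * st * cp, 0.0],      # d_phi
    ]

def metric_components(r, theta, phi):
    """e_ab = e(d_a, d_b), i.e. Euclidean dot products of the Jacobian rows."""
    rows = jacobian(r, theta, phi)
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]

r, theta, phi = 2.0, 0.8, 2.1
e = metric_components(r, theta, phi)
expected = [[1.0, 0.0, 0.0], [0.0, r ** 2, 0.0], [0.0, 0.0, (r * math.sin(theta)) ** 2]]
```

Up to rounding, `e` should match diag(1, r², r² sin²θ).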
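Notation: A common notation is to write $dX^\alpha dX^\beta = \tfrac{1}{2}(dX^\alpha \otimes dX^\beta + dX^\beta \otimes dX^\alpha) = dX^\beta dX^\alpha$, so
that
\begin{align*}
g = g_{\alpha\beta}\, dX^\alpha \otimes dX^\beta &= \tfrac{1}{2}\left(g_{\alpha\beta} + g_{\beta\alpha}\right) dX^\alpha \otimes dX^\beta && \text{using symmetry of } g_{\alpha\beta} \\
&= \tfrac{1}{2} g_{\alpha\beta}\, dX^\alpha \otimes dX^\beta + \tfrac{1}{2} g_{\beta\alpha}\, dX^\alpha \otimes dX^\beta \\
&= \tfrac{1}{2} g_{\alpha\beta} \left(dX^\alpha \otimes dX^\beta + dX^\beta \otimes dX^\alpha\right) && \text{relabelling} \\
&= g_{\alpha\beta}\, dX^\alpha dX^\beta .
\end{align*}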
Meanwhile the metric tensor itself is often written $ds^2$ (this is completely standard and equally
illogical – in general g is not of the form ds ⊗ ds for some function s!) leading to expressions like
\[ ds^2 = dx^2 + dy^2 + dz^2 \]
for the Euclidean metric, or
\[ ds^2 = dt^2 - dx^2 - dy^2 - dz^2 \]
for the metric of special relativity. The components of g in the given coordinates may be read off
easily from the coefficients in such expressions. For example, if
\[ g = dt^2 + 2\,dt\,dx - dx^2 - dy^2 - 4\,dx\,dy \]
then the nonzero metric components in (t, x, y) coordinates are $g_{tt} = 1$, $g_{tx} = 1 = g_{xt}$, $g_{xx} = -1$,
$g_{yy} = -1$, $g_{xy} = -2 = g_{yx}$. Note that the value of $g_{tx}$ is half the coefficient of dt dx, which is equal to
$g_{tx} + g_{xt}$.
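The reading-off rule (diagonal coefficients are taken as they stand, off-diagonal coefficients are halved and shared between the two symmetric entries) can be written as a small helper. The function name and the dictionary format for the coefficients are hypothetical conveniences, not notation from the notes.

```python
from fractions import Fraction

def metric_from_line_element(coords, coeffs):
    """Build the symmetric matrix g_ab from the coefficients of dX^a dX^b.

    coeffs maps a pair such as ("t", "x") to the coefficient of dt dx;
    a pair such as ("t", "t") holds the coefficient of dt^2.
    """
    index = {name: i for i, name in enumerate(coords)}
    n = len(coords)
    g = [[Fraction(0)] * n for _ in range(n)]
    for (a, b), c in coeffs.items():
        i, j = index[a], index[b]
        if i == j:
            g[i][i] += Fraction(c)
        else:
            # An off-diagonal coefficient equals g_ab + g_ba, so split it evenly.
            g[i][j] += Fraction(c) / 2
            g[j][i] += Fraction(c) / 2
    return g

# The example above: g = dt^2 + 2 dt dx - dx^2 - dy^2 - 4 dx dy.
g = metric_from_line_element(
    ("t", "x", "y"),
    {("t", "t"): 1, ("t", "x"): 2, ("x", "x"): -1, ("y", "y"): -1, ("x", "y"): -4},
)
```

Here `g[0][1]` is $g_{tx}$ and `g[1][2]` is $g_{xy}$, in the coordinate order (t, x, y).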
Because dXdY = dY dX in this product, we can calculate according to familiar algebraic rules.
The following example illustrates this, along with another way of finding the S2 metric.
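Example 23. The standard metric on $S^2$ is inherited from the 3-dimensional Euclidean metric and can
be found by parametrising it in spherical polar coordinates (θ, ϕ) on a suitable patch,
\[ x = \sin\theta \cos\varphi, \qquad y = \sin\theta \sin\varphi, \qquad z = \cos\theta, \]
and applying the Leibniz rule to find
\begin{align*}
dx &= \cos\theta \cos\varphi\, d\theta - \sin\theta \sin\varphi\, d\varphi \\
dy &= \cos\theta \sin\varphi\, d\theta + \sin\theta \cos\varphi\, d\varphi \\
dz &= -\sin\theta\, d\theta .
\end{align*}
Next, we substitute these expressions into the Euclidean metric $ds^2 = dx^2 + dy^2 + dz^2$. To do this
we calculate
\begin{align*}
dx^2 &= (\cos\theta \cos\varphi\, d\theta - \sin\theta \sin\varphi\, d\varphi)^2 \\
&= \cos^2\theta \cos^2\varphi\, d\theta^2 - 2\cos\theta \cos\varphi \sin\theta \sin\varphi\, d\theta\, d\varphi + \sin^2\theta \sin^2\varphi\, d\varphi^2
\end{align*}
and a similar expression for $dy^2$, while $dz^2 = \sin^2\theta\, d\theta^2$. Simplifying, the metric on $S^2$ is
\[ ds^2 = dx^2 + dy^2 + dz^2 = d\theta^2 + \sin^2\theta\, d\varphi^2 \]
as before.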
Signature Fix a point p. In given coordinates, the metric $\overset{X}{g}_{\alpha\beta}$ at p is a real symmetric matrix, and
we may therefore find n orthonormal eigenvectors with corresponding eigenvalues, all of which are
nonzero because $\overset{X}{g}$ is invertible. Let $n_+$ be the number of positive eigenvalues and $n_-$ be the number
of negative eigenvalues. The pair $(n_+, n_-)$ is the signature of the metric and is independent of the
coordinate system used; moreover one can find coordinates Y in which
\[ \overset{Y}{g}_{\bullet\bullet} = \mathrm{diag}(\underbrace{+1, \ldots, +1}_{n_+}, \underbrace{-1, \ldots, -1}_{n_-}) . \]
Proof. (Nonexaminable) In given X-coordinates there is an orthogonal matrix S such that
\[ S^T \overset{X}{g} S = \mathrm{diag}(\lambda_1, \ldots, \lambda_{n_+}, -\mu_1, \ldots, -\mu_{n_-}) \]
where all the $\lambda_i$ and $\mu_i$ are strictly positive. (We construct S from columns given by orthonormal
eigenvectors for $\overset{X}{g}$.) Then let $R = \mathrm{diag}((\lambda_1)^{-1/2}, \ldots, (\lambda_{n_+})^{-1/2}, (\mu_1)^{-1/2}, \ldots, (\mu_{n_-})^{-1/2})$ and set J =
SR. Then
\[ J^T \overset{X}{g} J = \mathrm{diag}(\underbrace{+1, \ldots, +1}_{n_+}, \underbrace{-1, \ldots, -1}_{n_-}) . \]
Now define coordinates $Y^\alpha = (J^{-1})^\alpha{}_\beta X^\beta$ (i.e., constant linear combinations of the X-coordinate
functions) so that
\[ X^\alpha = J^\alpha{}_\beta Y^\beta, \qquad \text{and hence} \qquad \frac{\partial x^\alpha}{\partial y^\beta} = J^\alpha{}_\beta, \]
whereupon
\[ \overset{Y}{g}_{\alpha\beta} = \frac{\partial x^\mu}{\partial y^\alpha} \frac{\partial x^\nu}{\partial y^\beta}\, \overset{X}{g}_{\mu\nu} = (J^T \overset{X}{g} J)_{\alpha\beta} \]
and hence $\overset{Y}{g}$ has the required form.
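For small matrices the signature can be computed without finding eigenvalues. The sketch below is not the eigenvector construction used in the proof: it swaps in symmetric Gaussian elimination, which is another congruence of the form g → JᵀgJ, and counts the signs of the pivots. By Sylvester's law of inertia these signs agree with the signs of the eigenvalues. The function name is hypothetical, and the sketch assumes that no zero pivot is met along the way.

```python
from fractions import Fraction

def signature(g):
    """Return (n_plus, n_minus) for an invertible symmetric matrix g.

    Counts the signs of the pivots produced by Gaussian elimination.
    Assumes every pivot is nonzero; no row or column swaps are attempted.
    """
    a = [[Fraction(x) for x in row] for row in g]
    n = len(a)
    n_plus = n_minus = 0
    for k in range(n):
        pivot = a[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: this simple sketch does not handle it")
        if pivot > 0:
            n_plus += 1
        else:
            n_minus += 1
        # Eliminate below the pivot, updating only the remaining lower-right block.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return (n_plus, n_minus)

minkowski = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
print(signature(minkowski))
```

A diagonal matrix such as the one above simply has its diagonal entries as pivots, so the function returns the counts of positive and negative entries.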
Evidently we may associate a signature with a metric at each point. In GR we tend to express
the signature by a string of pluses and minuses. The main signatures of interest to us will be
Riemannian (+ + · · · +), [signature (n, 0)] e.g., the $S^2$ metric or standard Euclidean metric on $\mathbb{R}^n$
Lorentzian (+ − · · · −), [signature (1, n − 1)] e.g., metrics in special and general relativity.
Many GR books use (− + · · · +) for Lorentzian signature. This is completely equivalent but means
that many signs have to be changed if converting between the two. Neither convention is ‘right’ but
our choice has the advantage that g(u, u) > 0 for timelike vectors, as we will see shortly.
Riemannian geometry A Riemannian manifold is a manifold with a Riemannian metric g. It
holds that g(u, u) > 0 for all nonzero vectors u. The length of $u \in V_p$ is defined by
\[ \|u\| = \sqrt{g(u, u)} \]
where it is understood that the metric is evaluated at p. The angle θ between $u, v \in V_p$ is given by
\[ \cos\theta = \frac{g(u, v)}{\|u\|\, \|v\|} \]
generalising standard Euclidean geometry, with g(u, v) playing the role of the dot product between u
and v. Evidently the dot product can change with p, unlike in Euclidean geometry. If c is a smooth
curve, the length of the segment c([a, b]) is
\[ L = \int_a^b \|\dot{c}(t)\|\, dt = \int_a^b \sqrt{g(\dot{c}(t), \dot{c}(t))}\, dt . \]
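As a small numerical illustration of the length formula, the sketch below integrates the speed of a curve on the round 2-sphere, using the metric dθ² + sin²θ dϕ² from the earlier examples. The choice of curve (a circle of latitude), the midpoint rule, and all names are assumptions made for the example.

```python
import math

def sphere_metric(theta, phi):
    """Components of g = dtheta^2 + sin^2(theta) dphi^2 in (theta, phi) coordinates."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def curve_length(curve, velocity, a, b, steps=2000):
    """Approximate L = integral of sqrt(g(c', c')) dt with the midpoint rule."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        g = sphere_metric(*curve(t))
        v = velocity(t)
        speed_sq = sum(g[i][j] * v[i] * v[j] for i in range(2) for j in range(2))
        total += math.sqrt(speed_sq) * h
    return total

def latitude(t):
    """Hypothetical test curve: theta = pi/3, phi = t."""
    return (math.pi / 3, t)

def latitude_velocity(t):
    """Coordinate components of the tangent vector of the test curve."""
    return (0.0, 1.0)

length = curve_length(latitude, latitude_velocity, 0.0, 2 * math.pi)
```

For this curve the speed is the constant sin(π/3), so the result should be close to 2π sin(π/3), the Euclidean circumference of that circle of latitude.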
Lorentzian geometry A Lorentzian manifold is a manifold with a Lorentzian metric g. There are
now three types of vector at each point p.
• If g(u, u) > 0 we say that u is timelike
• If g(u, u) = 0 we say that u is null (or lightlike) [so u = 0 is null by convention]
• If g(u, u) < 0 we say that u is spacelike
As in SR, the timelike vectors at p lie inside a double
cone in $V_p$, while the null vectors lie on the cone and
the spacelike ones point outside it. There should be a
continuous choice of which half-cone contains the future-pointing
timelike vectors. One difference with SR is that
in SR we identify $V_p$ with spacetime, whereas in GR each
$V_p$ is a different vector space. Nonetheless, we can sketch
local light cones to get an idea of what is happening in
spacetime.
[Figure: the double cone in $V_p$, with the timelike, null and spacelike regions labelled.]
Example 24. A two-dimensional metric is given in (t, r)
coordinates with t ∈ R, r > R by
\[ g = (1 - R/r)\, dt^2 - \frac{dr^2}{1 - R/r} \]
for constant R > 0. This is a cut-down version of the Schwarzschild metric for a black hole, with R
as the Schwarzschild radius. A vector u in these coordinates is null if
\[ 0 = g(u, u) = (1 - R/r)(u^t)^2 - \frac{(u^r)^2}{1 - R/r}, \]
i.e.,
\[ u^r = \pm (1 - R/r)\, |u^t| . \]
For $r \gg R$, the null directions are approximately as in 2-dimensional Minkowski spacetime with
$u^r \approx \pm u^t$. However as r decreases towards R, the null rays ‘tighten up’. Note that the vector $u = \partial_t$
is timelike: $g(\partial_t, \partial_t) = 1 - R/r > 0$. The lightcone structure is independent of t. We can sketch these
null directions on spacetime, essentially making a local identification between $V_p$ and M near p so
that the origin of $V_p$ is placed at p.
In the plot, the cone of future-pointing timelike directions is shaded pink and null directions are
indicated in blue.
[Figure: light cones sketched in the (r, t) plane, with the line r = R marked.]
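The ‘tightening’ of the null directions in Example 24 can be tabulated directly from $u^r = \pm(1 - R/r)|u^t|$. The sketch below uses hypothetical function names and sets R = 1 by default purely for convenience. It evaluates the slope of the null directions and confirms that the resulting vector has g(u, u) = 0.

```python
def null_slope(r, R=1.0):
    """Return |u^r / u^t| for a null vector of g = (1 - R/r) dt^2 - dr^2 / (1 - R/r)."""
    if r <= R:
        raise ValueError("the coordinates of this example require r > R")
    return 1.0 - R / r

def g_of(u_t, u_r, r, R=1.0):
    """Evaluate g(u, u) for u = u^t d_t + u^r d_r at radius r."""
    f = 1.0 - R / r
    return f * u_t ** 2 - u_r ** 2 / f

for r in (1.001, 1.5, 3.0, 100.0):
    slope = null_slope(r)
    # g(u, u) should vanish (up to rounding) for u = d_t + slope * d_r.
    print(r, slope, g_of(1.0, slope, r))
```

The slope approaches 1 for large r, as in Minkowski spacetime, and approaches 0 as r decreases towards R.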
We can classify curves in a similar way: saying that c is
• timelike if $\dot{c}(t)$ is timelike for all t
• null if $\dot{c}(t)$ is null for all t
• spacelike if $\dot{c}(t)$ is spacelike for all t
A typical curve fits none of these descriptions, but the ones of interest in GR generally do. In
particular: massive particles follow timelike curves, while massless particles follow null curves.
We can define the length of a segment c([a, b]) of a spacelike curve almost as in the Riemannian
case
\[ L = \int_a^b \sqrt{|g(\dot{c}(t), \dot{c}(t))|}\, dt . \]
If c is timelike, the analogous integral defines the elapsed proper time along the segment, and of
course we don’t need to put in the absolute value sign. It follows that the parameter t measures
proper time along c if and only if $g(\dot{c}(t), \dot{c}(t)) = 1$ for all t. There is not much point in considering
length or proper time along null curves!
Raising and lowering indices A metric defines an invertible linear map between the spaces of
vectors and covectors at any point. Given $u \in V_p$, we define $u^\flat \in V_p^*$ so that
\[ u^\flat(v) = g(u, v) \qquad \text{for all } v \in V_p \]
or, in components $(u^\flat)_\alpha = g_{\alpha\beta} u^\beta$. The inverse operation, mapping $\omega \in V_p^*$ to $\omega^\sharp \in V_p$, is given in
a similar way by the inverse metric
\[ \eta(\omega^\sharp) = g^{-1}(\eta, \omega) \qquad \text{for all } \omega, \eta \in V_p^* \]
or, in components, $(\omega^\sharp)^\alpha = g^{\alpha\beta} \omega_\beta$. To see that ♯ and ♭ are inverse operations, we calculate
\[ ((u^\flat)^\sharp)^\alpha = g^{\alpha\beta} (u^\flat)_\beta = g^{\alpha\beta} g_{\beta\gamma} u^\gamma = \delta^\alpha{}_\gamma u^\gamma = u^\alpha \]
for any $u \in V_p$. A similar calculation shows that $(\omega^\sharp)^\flat = \omega$ for all $\omega \in V_p^*$.
When there is only one metric on the scene, it is conventional to write the ‘musical isomorphisms’
simply by raising or lowering indices. So we write $u_\alpha$ for $(u^\flat)_\alpha$ and $\omega^\alpha$ for $(\omega^\sharp)^\alpha$. The notational
convention is extended to general tensors, so that contraction of the metric with a contravariant
index lowers it, while contraction of the inverse metric with a covariant index raises it, e.g.,
\[ S^\alpha{}_{\beta\gamma}\, g_{\alpha\delta} = S_{\delta\beta\gamma}, \qquad S^\alpha{}_{\beta\gamma}\, g^{\beta\delta} = S^{\alpha\delta}{}_\gamma . \]
This is really just a shorthand notation. If S was originally defined as a $\binom{1}{2}$ tensor, then the
components $S_{\delta\beta\gamma}$ had no prior meaning, so the above equation defines the right-hand side rather
than asserts an equality between two already defined quantities. This is also why we have avoided
stacking contravariant indices directly above covariant ones. Although it saves some space it can
create confusion! (See footnote 4.) In cases where more than one metric is considered it is important to be clear
which is used to raise or lower indices.
Example 25. Lowering the first index on the Kronecker δ gives
\[ \delta_{\alpha\beta} = g_{\alpha\gamma}\, \delta^\gamma{}_\beta = g_{\alpha\beta} \]
by properties of δ. Now raise the second index:
\[ \delta_\alpha{}^\beta = g^{\beta\gamma} \delta_{\alpha\gamma} = g^{\beta\gamma} g_{\alpha\gamma} = g^{\beta\gamma} g_{\gamma\alpha} = \delta^\beta{}_\alpha \]
using symmetry of g and the definition of the inverse metric. Because $\delta_\alpha{}^\beta = \delta^\beta{}_\alpha$ there is no
ambiguity in saving a little space by writing $\delta^\alpha_\beta$ (this is the exception that proves the rule!).
Footnote 4: For instance one has to know whether $S^{\alpha\mu}_\nu$ was the result of raising the first or second covariant index of $S^\alpha{}_{\beta\gamma}$.
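The statement that lowering and then raising an index returns the original components is easy to check in a concrete case. The sketch below uses a hypothetical 2×2 matrix of metric components at a single point (symmetric and invertible, but otherwise arbitrary) and exact rational arithmetic. The names `lower` and `raise_index` are illustrative.

```python
from fractions import Fraction as F

# Hypothetical metric components g_ab at a point (symmetric, invertible).
g = [[F(1), F(1)], [F(1), F(-1)]]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
# Inverse metric components g^ab from the 2x2 inverse formula.
g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def lower(u):
    """Components of u-flat: u_a = g_ab u^b."""
    return [sum(g[a][b] * u[b] for b in range(2)) for a in range(2)]

def raise_index(w):
    """Components of w-sharp: w^a = g^ab w_b."""
    return [sum(g_inv[a][b] * w[b] for b in range(2)) for a in range(2)]

u = [F(3), F(-2)]
u_lowered = lower(u)
round_trip = raise_index(u_lowered)
```

Here `round_trip` reproduces `u`, mirroring the index calculation with $g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha{}_\gamma$.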
5 Geodesics
Newton’s first law asserts that a body in force-free motion follows a straight line in space, which
– Euclidean geometry tells us – always provides the shortest path between any of its points. That
means that if we consider all possible smooth paths from one point to the other, the straight line
achieves the global minimum of the distance. Similarly, in special relativity, inertial particles and
light rays follow straight line paths in spacetime. In general relativity we replace the idea of a
straight line by the notion of a geodesic. We will find that geodesics manage to be both ‘as straight
as possible’ and – while not being minimum distance paths in general – are stationary points for the
Lorentzian distance among all possible curves between two given points. According to the geodesic
hypothesis of General Relativity, freely falling massive particles follow timelike geodesics, while
freely falling photons follow null geodesics. We start by understanding the ‘stationary distance’
formulation of a geodesic and then return in later sections to understand the ‘as straight as possible’
version.
Suppose g is a metric (either Riemannian or Lorentzian) and let c(τ)
be a smooth curve. For simplicity, let us refer to
\[ L = \int_a^b \sqrt{|g(\dot{c}, \dot{c})|}\, d\tau \]
as the length of the segment from c(a) to c(b) [even though it is a proper
time if c is a timelike curve in a Lorentzian spacetime]. Now let $c_s$
be any family of curves, depending smoothly on a parameter s so that
$c_0 = c$ and $c_s(a) = c(a)$, $c_s(b) = c(b)$. The length L of $c_s$ depends smoothly on s near
s = 0 in the Riemannian case, or in the Lorentzian case if c is either
everywhere timelike or everywhere spacelike. We call the family $c_s$ a
variation of c. We are interested in those curves for which L is stationary,
i.e., $dL/ds|_{s=0} = 0$ for all variations of c. In fact it turns out to be simpler
to change the problem slightly, to get rid of the modulus and square root.
This apparent cheat can be justified.
[Figure: a curve $c = c_0$ from c(a) to c(b), together with a nearby curve $c_s$ of a variation.]
Definition 26. c is a geodesic between c(a) and c(b) if $dE/ds|_{s=0} = 0$
for all variations of c, where
\[ E = \int_a^b g(\dot{c}, \dot{c})\, d\tau . \]
An important point here is that we have not used coordinates to define what a geodesic is.
Therefore all coordinate systems should agree on which curves are geodesics. To find them, we use
the following result, which is also used in the variational formulation of Lagrangian mechanics and
proved (nonexaminably!) in Section 5.1.
Theorem 27. Suppose X is a coordinate map whose chart domain contains the segment c([a, b]).
Then c is a geodesic if and only if $x^\alpha(\tau) = X^\alpha(c(\tau))$ solves the Euler-Lagrange equations
\[ \frac{d}{d\tau} \frac{\partial L}{\partial \dot{x}^\alpha} - \frac{\partial L}{\partial x^\alpha} = 0 \]
for Lagrangian $L(x^1, \ldots, x^n, \dot{x}^1, \ldots, \dot{x}^n, \tau) = g_{\alpha\beta}|_{X^{-1}(x)}\, \dot{x}^\alpha \dot{x}^\beta$.
We can now use tricks from Lagrangian mechanics to find geodesics in a given coordinate
system. For instance, we notice immediately that L has no explicit dependence on the ‘time’
parameter τ. There is a corresponding conserved quantity – the Jacobi integral – which in this
instance turns out to be L itself. Therefore, if the initial tangent vector $\dot{c}(a)$ is timelike (resp.,
null or spacelike), so is the whole curve c; if $\dot{c}(a)$ is a unit timelike vector then so is $\dot{c}$ everywhere,
which means that c is parametrised by proper time. Similarly, if $\dot{c}(a)$ is a unit spacelike vector then
c is parametrised by arc length. In the null case, we simply call τ an affine parameter – it has no
intrinsic physical meaning, because there is no preferred normalisation for a null tangent vector.
The other situation in which a Lagrangian system has conserved quantities is when it is
independent of one of the generalised coordinates $x^\gamma$, for then we see that the corresponding
generalised momentum $\partial L / \partial \dot{x}^\gamma$ is constant. Note:
\[ \frac{\partial}{\partial \dot{x}^\gamma} \left( g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta \right) = g_{\alpha\beta}\, \delta^\alpha{}_\gamma\, \dot{x}^\beta + g_{\alpha\beta}\, \dot{x}^\alpha\, \delta^\beta{}_\gamma = g_{\gamma\beta}\, \dot{x}^\beta + g_{\alpha\gamma}\, \dot{x}^\alpha = 2 g_{\gamma\beta}\, \dot{x}^\beta . \]
Dividing by the factor of 2, this means that $g_{\gamma\beta}\, \dot{x}^\beta$ is conserved along the geodesic for this value of γ. This
can simplify the solution of the system. Evidently this situation certainly occurs when
\[ \frac{\partial (g_{\alpha\beta} \circ X^{-1})}{\partial x^\gamma} = 0 \]
for all α, β, and the particular value of γ in question.
It is advisable to find symmetries where we can, because the full form of the equations is slightly
involved: we have (with some judicious relabelling)
\[ 2 \frac{d}{d\tau} \left( g_{\alpha\beta}\, \dot{x}^\beta \right) - \frac{\partial g_{\gamma\beta}}{\partial x^\alpha}\, \dot{x}^\gamma \dot{x}^\beta = 0 \]
(also writing $\partial g_{\gamma\beta} / \partial x^\alpha$ as a shorthand for $\frac{\partial}{\partial x^\alpha} \left( g_{\gamma\beta} \circ X^{-1} \right)(x^1, \ldots, x^n)$) and so
\[ 2 \ddot{x}^\beta g_{\alpha\beta} + 2 \frac{\partial g_{\alpha\beta}}{\partial x^\gamma}\, \dot{x}^\gamma \dot{x}^\beta - \frac{\partial g_{\gamma\beta}}{\partial x^\alpha}\, \dot{x}^\gamma \dot{x}^\beta = 0 \]
so raising the index, with a little more relabelling,
\[ \ddot{x}^\alpha + g^{\alpha\delta} \left( \frac{\partial g_{\delta\beta}}{\partial x^\gamma} - \frac{1}{2} \frac{\partial g_{\gamma\beta}}{\partial x^\delta} \right) \dot{x}^\beta \dot{x}^\gamma = 0, \]
which is a system of nonlinear coupled second order equations. Good luck with that, as they say.
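When no closed-form solution is available, the geodesic equations can be integrated numerically and the conserved quantities used as a check. The sketch below does this for the round 2-sphere, whose geodesic equations are quoted in Example 28. The classical fourth-order Runge-Kutta scheme, the step size, and the initial data are illustrative assumptions. It monitors L = g(ċ, ċ) and the momentum sin²θ ϕ̇ associated with the absence of ϕ dependence in the metric.

```python
import math

def rhs(state):
    """First-order form of the round 2-sphere geodesic equations."""
    theta, phi, dtheta, dphi = state
    return (
        dtheta,
        dphi,
        math.sin(theta) * math.cos(theta) * dphi ** 2,
        -2.0 * math.cos(theta) / math.sin(theta) * dphi * dtheta,
    )

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, factor):
        return tuple(a + factor * b for a, b in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def lagrangian(state):
    """L = g(c', c') = dtheta^2 + sin^2(theta) dphi^2."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2

def phi_momentum(state):
    """g_{phi b} x'^b = sin^2(theta) dphi, conserved as g has no phi dependence."""
    theta, _, _, dphi = state
    return math.sin(theta) ** 2 * dphi

def integrate(state, h=1e-3, steps=2000):
    """Advance the state through the given number of RK4 steps."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = (1.0, 0.0, 0.3, 0.8)  # hypothetical initial (theta, phi, dtheta, dphi)
end = integrate(start)
```

Comparing `lagrangian` and `phi_momentum` at `start` and `end` gives a quick consistency check on the integration. The initial data are chosen so that the curve stays away from the poles θ = 0, π, where these coordinates break down.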
Example 28.
1. Minkowski spacetime in Cartesian coordinates (or any other situation in which all the metric
components are constant). All the metric derivatives vanish so the equations become $\ddot{x}^\alpha = 0$,
i.e., the solution is
\[ x^\alpha(\tau) = x^\alpha(0) + \tau\, \dot{x}^\alpha(0) \]
in terms of the initial position and velocity. Timelike geodesics are straight lines in these
coordinates. For a proper time parametrisation, we require $g(\dot{x}(0), \dot{x}(0)) = 1$.
2. The cut Euclidean plane $\mathbb{R}^2 \setminus \{(x, 0) : x \ge 0\}$ in plane polar coordinates has metric
\[ g = dr^2 + r^2\, d\theta^2 . \]
In this case, $g_{rr} = 1$, $g_{\theta\theta} = r^2$ and $g_{r\theta} = g_{\theta r} = 0$. We seize on the fact that g has no explicit θ
dependence. Thus
\[ h = g_{\theta\beta}\, \dot{x}^\beta = g_{\theta\theta}\, \dot\theta = r^2 \dot\theta \]
is constant along the geodesics. You should be powerfully reminded of angular momentum.
While we could write out the other Euler–Lagrange equation it is simpler to turn to the other
conserved quantity,
\[ L = g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta = \dot{r}^2 + r^2 \dot\theta^2 = \dot{r}^2 + \frac{h^2}{r^2} \]
and indeed L ≡ 1 in an arc-length parametrisation. Thus
\[ r \dot{r} = \pm \sqrt{r^2 - h^2} \]
which is separable and solved by
\[ \pm\tau = \int_{r(0)}^{r(\tau)} \frac{r\, dr}{\sqrt{r^2 - h^2}} = \sqrt{r(\tau)^2 - h^2} - \sqrt{r(0)^2 - h^2} \]
from which we find
\[ r(\tau) = \sqrt{(\tau - \tau_0)^2 + h^2} \]
for a constant $\tau_0$, which is the parameter value at which r is minimised. Solving $r^2 \dot\theta = h$ as a
differential equation for θ, we find
\[ \theta(\tau) = \theta_0 + \tan^{-1}\!\left( \frac{\tau - \tau_0}{h} \right) . \]
It is not hard to see that the resulting curves are straight lines, written in polar coordinates
(exercise). This is the expected result.
3. For the round 2-sphere metric $g = d\theta^2 + \sin^2\theta\, d\varphi^2$, one may show similarly (exercise!) that
geodesics obey
\[ \ddot\theta - \sin\theta \cos\theta\, \dot\varphi^2 = 0, \qquad \ddot\varphi + 2 \cot\theta\, \dot\varphi\, \dot\theta = 0 . \]
The resulting curves are great circles, e.g., curves with constant ϕ.
4. The Schwarzschild metric is given by
\[ g = (1 - R/r)\, dt^2 - \frac{dr^2}{1 - R/r} - r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2) \]
in (t, r, θ, ϕ) coordinates for constant R > 0 (the Schwarzschild radius). Clearly the ‘La...
