Laboratory 8
Building a synthetic gene
One of the more striking developments in molecular biology in the past decade
has been the emergence of gene synthesis methods that use chemically synthesized templates and DNA
polymerases. If you know the DNA sequence you want, for example a coding sequence
from an organism that has been sequenced in its entirety, you can now order it online
from a company for less than $0.20 per nucleotide base. The DNA might be shipped to
you in segments of about 500 bp for assembly, and that’s when you need to use a DNA
polymerase to knit these segments together. In this lab you’re going to assemble a very
small gene from short single-stranded DNA pieces using a DNA polymerase, and analyze
your result by agarose gel electrophoresis. The technique is essentially an abbreviated
polymerase chain reaction, a method that has revolutionized the fields
of molecular and cellular biology, genetics, immunology, cancer biology, and
biotechnology. Scientists with DNA skills are employed by many universities and
biotech/biomed companies in the Los Angeles area.
Here are some examples of ways that gene synthesis is used:
• We can learn how a gene product functions in a cell or organism by making the
  gene nonfunctional or different at the DNA level, and then seeing what
  phenotypes result. With the ability to build new versions of genes with a desired
  sequence, we can have nearly complete control of the genotype.
• We can learn how an enzyme functions by mutating the coding sequence of its
  gene so that an amino acid is changed. We do that type of genetic engineering
  work at the level of DNA, and a modified enzyme can be produced in a host cell
  by transcription and translation.
• We can now even make microorganisms with entirely synthetic genomes, which
  may allow us to readily develop organisms with novel regulatory and metabolic
  pathways.
In this laboratory you’re going to build an entirely synthetic gene with the following 180
nt (nucleotide) sequence:
ATGTACTGCG CCCTGAGCAC CGCCACCGAG TGCCTGGCCA GCAGCTGGAT CCTGCTGCAC
GAGCTGCCCA TGGAGCTGGA GGCCCGCAAC GCCAACGACC GCGAGGCCGA CGCCAACGAC
ACCCACATCA ACAAGCTGAT CAAGGAGGCC AGCTGCATCG AGAACACCAT CAGCACCTAG
This gene is designed to encode a specific and meaningful amino acid sequence, and
you can parse the sequence into triplet codons and use a genetic code table (or web-based
application) to perform the translation. The first 10 codons are decoded below, using
1-letter amino acid abbreviations, and spell out “My Cal State.”
 M   Y   C   A   L   S   T   A   T   E  ...
ATG TAC TGC GCC CTG AGC ACC GCC ACC GAG ...
Test your understanding of the laboratory:
• Can you decode a DNA sequence into a one-letter amino acid sequence?†
(Several methods are suggested in the lab section titled “Translating a nucleic acid
sequence into an amino acid sequence” – try it with the entire 180 nt)
The synthetic gene will be built from six different short single-stranded DNAs
(“oligonucleotides”), base paired (“annealed”) to each other and extended by multiple
rounds of DNA polymerization. Part of the assembly process is a polymerase chain
reaction, in which the replication leads to a doubling or amplification of the number of
DNA copies in each synthetic cycle. In the first stage, three double-stranded DNA
fragments are constructed by annealing of pairs of oligonucleotides (numbered arrows
below, where the arrowheads are the free 3’ ends) and extension through DNA synthesis
(dashed lines):
In the next round of synthesis, these products anneal with each other at their 3’ ends and
are extended to make two longer constructions:
These two products can anneal with each other in the third round of synthesis to make a
full-length product of 180 nt:
*see end of lab for further explanation of this process
Test your understanding of the laboratory:
• Can you show how these two oligonucleotides can base pair at their 3’ ends?†
1. 5’ATGTACTGCGCCCTGAGCACCGCCACCGAGTGCCTGGCCAGCAGC
2. 5’CTCCATGGGCAGCTCGTGCAGCAGGATCCAGCTGCTGGCCAGGCA
Hint: the strands in double-stranded DNA are antiparallel.
• Once these are annealed, can you show how DNA polymerase would make a
double-stranded product using base-pairing rules?†
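One way to check your answers to the two questions above is with a short program. The following is only a sketch, written for Python 3; the function names (reverse_complement, three_prime_overlap, extend) are made up for this illustration and are not part of the lab materials. It writes the second oligonucleotide as its reverse complement, finds the longest stretch where the 3’ end of the first pairs with the 3’ end of the second, and then builds the top strand that DNA polymerase would complete.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the antiparallel partner strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def three_prime_overlap(top, bottom):
    """Number of bases at the 3' end of `top` that pair with the 3' end of `bottom`."""
    bottom_as_top = reverse_complement(bottom)
    for k in range(min(len(top), len(bottom_as_top)), 0, -1):
        if top.endswith(bottom_as_top[:k]):
            return k
    return 0

def extend(top, bottom):
    """Top strand of the double-stranded product after both 3' ends are extended."""
    k = three_prime_overlap(top, bottom)
    if k == 0:
        raise ValueError("these strands do not anneal at their 3' ends")
    return top + reverse_complement(bottom)[k:]

oligo1 = "ATGTACTGCGCCCTGAGCACCGCCACCGAGTGCCTGGCCAGCAGC"
oligo2 = "CTCCATGGGCAGCTCGTGCAGCAGGATCCAGCTGCTGGCCAGGCA"

overlap = three_prime_overlap(oligo1, oligo2)
product = extend(oligo1, oligo2)
print("base pairs in the annealed region:", overlap)
print("length of the extended product:", len(product))
print("top strand:   ", product)
print("bottom strand:", reverse_complement(product))
```

The bottom strand is printed 5’ to 3’, so read it right to left when lining it up under the top strand.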
Once the full-length sequence is assembled, it can be copied by polymerase chain
reaction, using an excess of oligonucleotides 1 and 6 (the distal ones). The details of the
DNA sequences and reaction mixtures are provided in the back pages of this lab (the
section titled “Assembly and amplification”).
As you prepare for the lab, you should do the following things first:
• Read through the instructions for this laboratory before you commence work.
Listen carefully to instructions from your laboratory instructor and take notes so
you won’t make mistakes later.
• Familiarize yourself with the location of the supplies and equipment you’re going
to be using in the lab.
Laboratory instructions
In this laboratory you’ll be building a small 180 nt synthetic gene by polymerase chain
reaction, and analyzing its mobility by gel electrophoresis. You’ll have two reaction
tubes, one with all the components needed for synthesis, and a “negative control” that
contains all components except DNA polymerase.
1. Set up and run your polymerase chain reactions.
A. Pipet 5 µl of DNA solution into each of two tubes. Label one tube as
   “experimental” and the other “control”.
   Note: The DNA solution has all the dNTP substrates and oligonucleotides.
B. Pipet 5 µl of Enzyme solution into the experimental, and 5 µl of Buffer alone
   solution into the control, and vortex each tube.
   Note: The Enzyme solution contains the DNA polymerase in a buffer, and the
   Buffer alone solution is for your negative control.
C. Place your tubes in the thermocycler, recording the tube positions (the labels may
   wear off!). Your lab instructor will initiate the synthesis program.
   Note: The thermocycler takes the tubes through a programmed series of
   heating and cooling steps.
During the time the thermocycler is operating, you can watch its progress on the
screen. You should also use the time to plan the next steps in your lab.
2. Load and run your gel.
D. When the program is complete, remove your tubes and add 2.5 µl of 5x FlashGel®
   Loading Dye to each tube. Vortex.
   Note: Adding 2.5 µl of a 5x dye mix to 10 µl is a 1:5 dilution of the dye
   (2.5 µl:12.5 µl), which becomes “1x”.
E. Load 5 µl of your samples into the gel lanes indicated by your instructor. Gently
   squeeze the pipetter button to release the sample into the well.
   Note: Work efficiently – everyone needs to load samples on a shared gel, so be
   ready when it is your turn to load and then move aside.
F. Once the gel is loaded, including the DNA marker, your instructor will start the
   electrophoresis.
   Note: The voltage may be set as high as 275 V, so do not touch the gel
   apparatus while it is running.
During the brief period of electrophoresis, you can watch the progress of the bands
by turning on the blue light in the base. The gel has a proprietary non-ethidium
bromide stain in it that fluoresces orange when bound to DNA. As a polyanion,
DNA always moves towards the anode (+, or “red”) terminal. The gel matrix
impedes progress, so the smaller fragments migrate more rapidly.
3. Analyze the gel results.
G. Your instructor will help you obtain a print of your gel. Using that and a ruler,
   determine the distance of migration from the gel origin (where you loaded the
   samples) to the bands in the DNA marker lane.
   Note: Start measuring from the bottom of the well, for consistency.
H. Prepare a table with two columns: the distances of migration (in cm or mm) and
   the sizes of the known DNA markers (in bp). These will be plotted on the x and y
   axes of a graph, respectively (see the “Gel analysis” section).
   Note: By first mapping out the migrations of DNA fragments of known sizes,
   you’re essentially making a “standard curve”.
I. Graph your results using the 3-cycle semi-log paper (at the end of this lab). The
   bottom cycle should be used for sizes of 10-100 bp, the middle cycle for sizes of
   100-1000 bp, and the top cycle for sizes of 1000-10000 bp.
   Note: Most of your markers will fall into the middle cycle of the y-axis.
J. Can the data points be modeled as straight lines for any parts of the graph? Is it
   reasonable to extrapolate the graph?
   Note: Your laboratory instructor will discuss this with you.
K. Using your graph, determine the distance of migration of the product(s) in your
   sample lanes. Did you obtain the expected 180 bp product?
   Note: Use the ruler again to measure distance in your sample lane (from
   the same well position). What DNA size on your standard curve appears
   to correspond to your experimental results?
Clean up
Return any unused DNA, Enzyme, and Buffer solutions to your instructor.
Dispose of your tubes and plastic tips in the appropriate receptacles, and wash your hands
before you leave the room.
Gel analysis
The DNA markers are a collection of linear DNA fragments of known size, and your
laboratory instructor will give you a list of them if they are not the same as the ones
shown below. Depending on how the gel was run, some of the larger bands may not be
resolved. You can prepare a simple table associating each DNA fragment size with its
migration, then plot the results using semi-log graph paper*. The sample gel below shows
the effects of conducting the polymerization through 0, 1, 2, 4, 7, 10, or 12 cycles. The
DNA markers are on the far left lane, and the bottom-most marker is 100 bp.
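Gel migration (x-axis)    DNA size (bp) (y-axis)
______________________    4000
______________________    2000
______________________    1250
______________________    800
______________________    500
______________________    300
______________________    200
______________________    100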
* The y-axis scale is “semi-logarithmic”, so if you plot the results with a computer
program instead of on paper, be sure to either select a logarithmic representation of the
y-axis scale or take the log of the y-axis data before graphing it in a linear format.
Once you’ve completed a graph of your DNA markers, you will see that one or more
parts of the graph might be represented continuously as straight lines (see example
below). This allows us to estimate the sizes of unknown bands in your sample.
Is one of these graphs “right” and the other “wrong”?
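If you would rather build the standard curve with a computer than on paper, the following Python 3 sketch shows one possible approach: it fits a straight line to log10(size) versus migration distance, then reads an unknown band off that line. The migration distances in it are HYPOTHETICAL numbers invented for illustration, not measurements from any real gel, and the function names are made up for this example. Only the markers from 100 to 800 bp are used here, on the assumption that they fall on the straight part of your graph; check that against your own plot.

```python
import math

def fit_semilog(distances, sizes_bp):
    """Least-squares line through (distance, log10(size)). Returns (slope, intercept)."""
    logs = [math.log10(size) for size in sizes_bp]
    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_size(distance, slope, intercept):
    """Size in bp that the fitted line predicts for a band at `distance`."""
    return 10 ** (slope * distance + intercept)

# Marker sizes from the handout; distances (mm) are HYPOTHETICAL placeholders.
marker_sizes_bp = [800, 500, 300, 200, 100]
marker_distances_mm = [21.0, 26.0, 31.5, 36.0, 43.5]

slope, intercept = fit_semilog(marker_distances_mm, marker_sizes_bp)

sample_distance_mm = 37.0  # HYPOTHETICAL: replace with your own measurement
print("estimated size (bp):", round(estimate_size(sample_distance_mm, slope, intercept)))
```

Replace both lists of distances with your own ruler measurements before drawing any conclusion about your product.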
Translating a nucleic acid sequence into an amino acid sequence
• One way to translate a sequence is by hand, using a genetic code table:
  http://web.ornl.gov/sci/techresources/Human_Genome/posters/chromosome/gencode.shtml
• There are also internet applications that will translate pasted DNA sequences:
  http://www.cbs.dtu.dk/services/VirtualRibosome/
• You can also write your own computer programs to analyze DNA, for example
  the following simple Python language program. Computer skills are becoming
  increasingly important in biology careers, and Python is widely used. You can
  read more about the Python language at http://www.python.org, and find a tutorial for
  beginners at http://en.wikibooks.org/wiki/Non-Programmer%27s_Tutorial_for_Python_2.6
# This simple Python program accepts a DNA sequence and prints out its translated
# sequence, using the first reading frame in the DNA. Note: If you copy and paste this
# program into a file, make sure the indentations are preserved
#
# Database of genetic code associations between triplet codons and amino acids…
ribosome = {'AAA':'K', 'AAG':'K', 'AAC':'N', 'AAT':'N', 'AGA':'R', 'AGG':'R', 'AGC':'S',
'AGT':'S', 'ACA':'T', 'ACG':'T', 'ACC':'T', 'ACT':'T', 'ATA':'I', 'ATG':'M', 'ATC':'I',
'ATT':'I', 'GAA':'E', 'GAG':'E', 'GAC':'D', 'GAT':'D', 'GGA':'G', 'GGG':'G', 'GGC':'G',
'GGT':'G', 'GCA':'A', 'GCG':'A', 'GCC':'A', 'GCT':'A', 'GTA':'V', 'GTG':'V', 'GTC':'V',
'GTT':'V', 'CAA':'Q', 'CAG':'Q', 'CAC':'H', 'CAT':'H', 'CGA':'R', 'CGG':'R', 'CGC':'R',
'CGT':'R', 'CCA':'P', 'CCG':'P', 'CCC':'P', 'CCT':'P', 'CTA':'L', 'CTG':'L', 'CTC':'L',
'CTT':'L', 'TAA':'*', 'TAG':'*', 'TAC':'Y', 'TAT':'Y', 'TGA':'*', 'TGG':'W', 'TGC':'C',
'TGT':'C', 'TCA':'S', 'TCG':'S', 'TCC':'S', 'TCT':'S', 'TTA':'L', 'TTG':'L', 'TTC':'F',
'TTT':'F'}
# Here's the program ...
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input  # Python 3 renamed raw_input to input
seq = ''  # build the clean DNA sequence in this variable
polypeptide = ''  # build the amino acid sequence in this variable
entered = read_line('Enter an upper-case DNA sequence ')  # get a DNA sequence from user
for n in entered:
    if n in ['G', 'A', 'T', 'C']:
        seq += n  # keep the nucleotides, exclude the other characters
# go through the sequence three characters at a time, skipping any incomplete codon at the end
for x in range(0, len(seq) - 2, 3):
    thisCodon = seq[x:x+3]  # this is the triplet codon sequence
    polypeptide += ribosome[thisCodon]  # add the amino acid to the growing chain
print('')  # blank line
print(polypeptide)  # all done - print the result. Wasn't that easy?
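The same idea can also be written as a function, which is handy if you want to translate a sequence without typing it at a prompt or if you want to try a different reading frame. This is a sketch only, assuming Python 3; the names BASES, AMINO_ACIDS, CODON_TABLE, and translate are made up for this example. It builds the standard genetic code from a 64-letter string ordered T, C, A, G at each codon position, rather than listing every codon by hand.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna, frame=0):
    """Translate `dna` starting at offset `frame` (0, 1, or 2); '*' marks a stop codon."""
    # Keep only nucleotides, so spaces and line breaks in pasted text are ignored.
    seq = "".join(ch for ch in dna.upper() if ch in "GATC")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)

# The first 10 codons of the lab's synthetic gene.
print(translate("ATG TAC TGC GCC CTG AGC ACC GCC ACC GAG"))  # prints MYCALSTATE
```

Unlike the program above, this version accepts lower-case input because it converts the sequence to upper case first.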
There are 20 types of amino acids encoded in the standard genetic code, and these are
listed and organized below.
Name              3-letter   1-letter
Nonpolar and aliphatic
  Glycine         Gly        G
  Alanine         Ala        A
  Valine          Val        V
  Leucine         Leu        L
  Isoleucine      Ile        I
  Proline         Pro        P
Aromatic
  Tyrosine        Tyr        Y
  Phenylalanine   Phe        F
  Tryptophan      Trp        W
Polar uncharged
  Serine          Ser        S
  Threonine       Thr        T
Basic
  Arginine        Arg        R
  Lysine          Lys        K
  Histidine       His        H
Acids and amides
  Aspartic acid   Asp        D
  Asparagine      Asn        N
  Glutamic acid   Glu        E
  Glutamine       Gln        Q
Sulfur-containing
  Methionine      Met        M
  Cysteine        Cys        C
Assembly and amplification
Your polymerase chain reaction solutions have been simplified to make it easy for you to
set up the experiment. Your solutions are prepared in a “1x” reaction buffer that is
optimal for the enzyme:
Buffer alone:
25 mM TAPS-HCl buffer (pH 9.3 @ 25°C)
50 mM KCl
2 mM MgCl2
1 mM β-mercaptoethanol
In addition to the reaction buffer, the DNA and enzyme solutions have the following
specific additives:
DNA solution:
250 μM each of dGTP, dATP, dTTP, and dCTP
0.5 μM of this oligonucleotide:
1. 5’ATGTACTGCGCCCTGAGCACCGCCACCGAGTGCCTGGCCAGCAGC
0.05 μM each of these oligonucleotides:
2. 5’CTCCATGGGCAGCTCGTGCAGCAGGATCCAGCTGCTGGCCAGGCA
3. 5’CACGAGCTGCCCATGGAGCTGGAGGCCCGCAACGCCAACGAC
4. 5’GTCGTTGGCGTCGGCCTCGCGGTCGTTGGC
5. 5’GCCAACGACACCCACATCAACAAGCTGATCAAGGAGGCC
0.5 μM of this oligonucleotide:
6. 5’CTAGGTGCTGATGGTGTTCTCGATGCAGCTGGCCTCCTTGATCAG
Enzyme solution:
0.05 units/μl Q5 “hot start” DNA polymerase
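Because you mix 5 µl of DNA solution with 5 µl of Enzyme (or Buffer alone) solution, every additive listed above ends up at half its stock concentration in the 10 µl reaction, while the buffer stays at 1x because both solutions are made up in it. The Python 3 sketch below works through that arithmetic, along with the loading dye dilution from step D. The function name diluted and the dictionary labels are made up for this illustration.

```python
def diluted(stock, volume_added_ul, final_volume_ul):
    """Concentration after `volume_added_ul` of stock ends up in `final_volume_ul` total."""
    return stock * volume_added_ul / final_volume_ul

REACTION_UL = 5.0 + 5.0  # 5 µl DNA solution + 5 µl Enzyme (or Buffer alone) solution

# Stock concentrations in the DNA solution, in µM, as listed in the handout.
dna_solution_uM = {
    "each dNTP": 250.0,
    "oligonucleotides 1 and 6 (each)": 0.5,
    "oligonucleotides 2 to 5 (each)": 0.05,
}
for name, stock in dna_solution_uM.items():
    print(name, "->", diluted(stock, 5.0, REACTION_UL), "µM in the reaction")

enzyme_units_per_ul = diluted(0.05, 5.0, REACTION_UL)
print("DNA polymerase ->", enzyme_units_per_ul, "units/µl in the reaction")

# Step D: 2.5 µl of 5x loading dye added to the 10 µl reaction.
dye_x = diluted(5.0, 2.5, REACTION_UL + 2.5)
print("loading dye ->", dye_x, "x")
```

Note that the two distal oligonucleotides (1 and 6) are present at ten times the concentration of the internal ones.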
The thermocycler is set to repeat the following temperature shifts in succession. Most of
the period of one cycle is taken up by the time necessary to change temperatures in the
block:
97ºC   15 seconds   (Denaturation of DNA to yield single strands)
30ºC   1 second     (Annealing temperature)
72ºC   1 second     (Optimal synthesis temperature)
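To see why the temperature changes dominate the cycle time, it can help to add up the programmed holds. The Python 3 sketch below stores the three steps above as data and totals the hold time for a given number of cycles. It also reports the ideal doubling of copies per cycle, which is an upper bound that applies only once full-length product exists. The cycle counts are taken from the sample gel in the “Gel analysis” section; the number of cycles your instructor programs is not stated here, and the names PROGRAM, hold_seconds, and ideal_fold_amplification are made up for this example.

```python
# (temperature in ºC, hold time in seconds, purpose), as listed in the handout
PROGRAM = [
    (97, 15, "denaturation"),
    (30, 1, "annealing"),
    (72, 1, "synthesis"),
]

def hold_seconds(cycles):
    """Total programmed hold time, ignoring the time spent changing temperature."""
    return cycles * sum(hold for _, hold, _ in PROGRAM)

def ideal_fold_amplification(cycles):
    """Upper bound on amplification: every template is copied once per cycle."""
    return 2 ** cycles

for cycles in (1, 4, 12):  # 12 is the largest cycle count in the sample gel
    print(cycles, "cycles:", hold_seconds(cycles), "s of holds,",
          ideal_fold_amplification(cycles), "fold at most")
```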
Here’s what happens in the assembly reaction:
The 3’ ends of the DNA oligonucleotides fit together like puzzle pieces, by base pairing.
Oligo 1 and 2 base pair:
1. 5’ATGTACTGCGCCCTGAGCACCGCCACCGAGTGCCTGGCCAGCAGC ->
[Diagram not recovered: the remaining rows were labeled 2, 4, 6, 3&4, and 5&6.]
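The three rounds of assembly described earlier in this lab can also be followed on a computer. The Python 3 sketch below is a simplified model, not a simulation of the real reaction: it tracks only the top strand of each product, and it assumes that two pieces join at their longest matching end, which is how the oligonucleotides listed above were designed to fit. The function names (reverse_complement, join) and the variable names are made up for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the antiparallel partner strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def join(left, right):
    """Merge two top-strand sequences at their longest end-to-start overlap."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap found")

OLIGOS = {
    1: "ATGTACTGCGCCCTGAGCACCGCCACCGAGTGCCTGGCCAGCAGC",
    2: "CTCCATGGGCAGCTCGTGCAGCAGGATCCAGCTGCTGGCCAGGCA",
    3: "CACGAGCTGCCCATGGAGCTGGAGGCCCGCAACGCCAACGAC",
    4: "GTCGTTGGCGTCGGCCTCGCGGTCGTTGGC",
    5: "GCCAACGACACCCACATCAACAAGCTGATCAAGGAGGCC",
    6: "CTAGGTGCTGATGGTGTTCTCGATGCAGCTGGCCTCCTTGATCAG",
}

# Round 1: odd-numbered oligos are top strands, even-numbered ones are bottom strands.
round1 = [join(OLIGOS[n], reverse_complement(OLIGOS[n + 1])) for n in (1, 3, 5)]
# Round 2: neighboring products anneal at their 3' ends and are extended.
round2 = [join(round1[0], round1[1]), join(round1[1], round1[2])]
# Round 3: the two longer products anneal to give the full-length gene.
full_length = join(round2[0], round2[1])

print("round 1 lengths:", [len(s) for s in round1])
print("round 2 lengths:", [len(s) for s in round2])
print("full length:", len(full_length))
print(full_length)
```

You can paste the printed full-length sequence into the translation program from the “Translating a nucleic acid sequence into an amino acid sequence” section to check the encoded message.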