TOWARDS AUTOMATIC GENE SYNTHESIS WITH
BIOINFORMATICS SOFTWARE, NOVEL ONE-STEP
REAL-TIME PCR ASSEMBLY, AND LAB-CHIP
GENE SYNTHESIS










HUANG MO CHAO
(B. Eng.), XJTU










A THESIS SUBMITTED
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF ELECTRICAL & COMPUTER
ENGINEERING


NATIONAL UNIVERSITY OF SINGAPORE
2009
National University of Singapore
Department of Electrical and Computer Engineering


ACKNOWLEDGEMENTS

It is the blessing from my Lord Jesus Christ, who has made this work possible.
I would like to express my sincere thanks to my supervisor, Dr. Li Mo-Huang, for his
patience and unfailing guidance. Without his valuable suggestions and support, this work would
not have been successful. Special thanks also go to Associate Professor Adekunle Olusola
Adeyeye, my NUS supervisor, for all his kind help and support during my four years of PhD study.
Many thanks are due to Dr. Yang Yi Yan for valuable suggestions on developing the
hydrogel valve, Dr. Danny Van Noort for useful comments on the device design, Fan Lee for
providing different hydrogel materials, and all my friends in the Institute of Bioengineering
and Nanotechnology (IBN) for their endless help. Special thanks go to Professor Jackie
Ying and Ms. Noreena AbuBakar for giving me the opportunity to work at IBN and
supporting me all the time.
I am deeply grateful to the teammates I worked with during my stay at
IBN. I would like to thank Dr. Cheong Wai Chye, Dr. Bode Marcus, Dr. Ali Emril
Mohamed, Wei Jiashen, Chua Jay, Muller Stefanie, Kuan Yoke Kong, Ye Hongye, Sim
Choon Kiat, Khor Samuel, Loh Nicholas and the students in the high-throughput gene
synthesis group at IBN. In addition to their generous help and strong support, I also
enjoyed their great companionship.
My greatest appreciation goes to my parents and grandmother for their endless love
and concern throughout my life. Without them, I would not have made it so far in life. Finally, I
would like to give special thanks to my boyfriend, Brian, who has been supportive, loving and
encouraging all the time.
This thesis is especially dedicated to all of you.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS
SUMMARY
LIST OF FIGURES
LIST OF TABLES

CHAPTER I INTRODUCTION
1.1 Overview of gene synthesis
1.2 Challenges in Gene Synthesis
1.3 Motivation
1.4 Objectives of this PhD thesis
1.5 Thesis Outline

CHAPTER II GENE SYNTHESIS METHODS
2.1 Introduction
2.2 Bioinformatics in Gene Synthesis
2.3 Biochemical method of gene synthesis
2.3.1 LCR based gene assembly
2.3.2 PCR based gene assembly
2.3.3 Real time PCR
2.3.4 DNA extraction and purification
2.3.5 Enzymatic error filtering
2.3.6 Cloning and sequencing of synthetic DNA
2.4 Fundamentals of lab-on-a-chip
2.4.1 Microvalves
2.4.2 Micromixers
2.4.3 On-chip PCR
2.4.4 On-chip DNA purification

CHAPTER III DESIGN AND OPTIMIZATION OF OLIGONUCLEOTIDES
3.1 Introduction
3.2 TmPrime oligonucleotide design methods and functional modules
3.2.1 Fast and flexible oligonucleotide design
3.2.2 Multiple-pool assembly
3.2.3 Mis-hybridization screening
3.2.4 Codon optimization
3.3 Experimental evaluation of TmPrime performance
3.3.1 Target Proteins
3.3.2 Real-time gene assembly and amplification
3.3.3 LCR assembly
3.4 Results
3.4.1 Designing oligonucleotides for target proteins
3.4.2 Oligonucleotide assembly and amplification
3.4.3 Comparison with existing oligonucleotide design programs
3.5 Discussion
3.6 Conclusion

CHAPTER IV TOPDOWN ONE-STEP GENE SYNTHESIS
4.1 Introduction
4.2 Principle of Top-Down PCR based gene synthesis
4.3 Experimental verification of TopDown one-step gene synthesis
4.3.1 Design of oligonucleotide for gene synthesis
4.3.2 Non-competition one-step real-time gene synthesis
4.3.3 One-step and two-step PCR-based gene synthesis
4.3.4 Agarose gel electrophoresis
4.4 Performance of TopDown one-step gene synthesis and its real-time analysis
4.4.1 Performance of TD one-step gene synthesis
4.4.2 Analysis of real-time gene synthesis
4.5 Discussion
4.6 Conclusion

CHAPTER V AUTOMATIC TOUCHDOWN ONE-STEP GENE SYNTHESIS
5.1 Introduction
5.1.1 Principle of Automatic TouchDown one-step gene synthesis
5.1.2 Mechanisms of PCR synthesis process
5.2 Experimental verification of Automatic TouchDown one-step gene synthesis
5.2.1 Design of oligonucleotides for gene synthesis
5.2.2 Automatic TouchDown one-step real-time gene synthesis
5.2.3 Gel electrophoresis
5.3 Theoretical analysis of DNA hybridization kinetics
5.4 Real-time performance study of ATD one-step gene synthesis
5.4.1 Effect of varying extension time during ATD one-step gene synthesis
5.4.2 Effect of varying initial oligonucleotides concentration
5.4.3 Effect of varying annealing temperature
5.4.4 Synthesis of long gene by ATD process
5.4.5 Effect of varying dNTP concentration
5.4.6 Effect of melting temperature uniformity of partitioned oligonucleotides
5.5 Discussion

CHAPTER VI INTEGRATED TWO-STEP GENE SYNTHESIS ON CHIP
6.1 Introduction
6.2 Device fabrication and thermal cycling system construction
6.2.1 Microfluidic device fabrication
6.2.2 Preparation of hydrogel valves
6.2.3 PCR thermal cycling
6.3 Experimental verification of the integrated two-step gene synthesis chip
6.3.1 Gene assembly and amplification
6.3.2 Solid-phase buffer exchange
6.3.3 Agarose gel electrophoresis
6.3.4 DNA sequencing
6.4 Results and discussion
6.4.1 Device operation
6.4.2 In situ hydrogel valve
6.4.3 PCR thermal cycling
6.4.4 Comparison of one-step and two-step gene syntheses
6.4.5 Thermally enhanced solid-phase PCR purification
6.5 Conclusion

CHAPTER VII CONCLUSIONS AND FUTURE PLAN
7.1 Summary
7.2 Future work
7.2.1 Synthesis of long difficult genes
7.2.2 Error filter
7.2.3 Integration of real-time fluorescence detection with gene synthesis system
Author’s Publications
References
Appendix I
Appendix II
Appendix III
Appendix IV



Towards automatic gene synthesis with bioinformatics software, novel
one-step real-time PCR assembly, and lab-chip gene synthesis

Huang Mo Chao
Under the supervision of Associate Professor Adekunle Olusola Adeyeye
at the National University of Singapore and Dr. Li Mo-Huang
at the Institute of Bioengineering and Nanotechnology

SUMMARY

This PhD thesis presents the development and optimization of gene synthesis methods,
including the bioinformatics software TmPrime and the TopDown and Automatic TouchDown
one-step gene synthesis methods. Based on the developed protocols, it also demonstrates an
integrated gene synthesis device capable of performing two-step gene synthesis and purifying
the synthesized product for downstream applications.
The bioinformatics software TmPrime is developed to optimize oligonucleotide design. It
designs oligonucleotides with homogeneous melting temperatures for both LCR and gapless
PCR assembly of very long gene sequences. Potential mis-hybridization, hetero-dimer,
homo-dimer and hairpin formations among the oligonucleotides are screened by pair-wise
sequence alignment. The utility of TmPrime is demonstrated by synthesizing three genes using
a gapless one-step or two-step process.
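
As a rough illustration of partitioning a sequence into fragments of similar melting temperature, the following Python sketch cuts a sequence into consecutive fragments whose estimated Tm just reaches a target value. It is not TmPrime's algorithm: the Wallace rule (2 °C per A/T, 4 °C per G/C) is used here only as a simple stand-in for a proper thermodynamic model, and the names `wallace_tm` and `partition_by_tm` are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (an assumption,
    not the model used by TmPrime): 2 per A/T base plus 4 per G/C base."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def partition_by_tm(seq: str, target_tm: int) -> list[str]:
    """Greedily cut seq into consecutive fragments, closing each fragment as
    soon as its estimated Tm reaches target_tm."""
    fragments = []
    start = 0
    for end in range(1, len(seq) + 1):
        if wallace_tm(seq[start:end]) >= target_tm:
            fragments.append(seq[start:end])
            start = end
    if start < len(seq):
        # Leftover tail whose Tm stayed below the target.
        fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    # Arbitrary illustrative sequence, not taken from the thesis.
    demo = "ATGGCGCTGAAAGGTCCGTTTACCGATCTGGAAATCCGTCTG"
    for fragment in partition_by_tm(demo, 40):
        print(fragment, wallace_tm(fragment))
```

Because every base adds 2 °C or 4 °C, each fragment except possibly the last ends within 4 °C of the target. An iterative adjustment of the cut positions would be needed to reduce the remaining spread further.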
The TopDown (TD) one-step gene synthesis method combines the advantages of the one-step
and two-step gene synthesis processes. It conducts gene synthesis with outer primers and
inner oligonucleotides that are designed and partitioned by TmPrime to have distinct melting
temperatures (ΔTm > 8°C). This reaction condition provides several advantages: (i)
eliminating potential competition between the assembly and amplification reactions, (ii)
minimizing the possibility of truncated oligonucleotides participating in the assembly process and

the resulting errors, (iii) providing a stringent annealing condition to reduce the potential for
forming secondary structures, and (iv) increasing the specificity of oligonucleotide
hybridization, as in touchdown PCR. All of these help prevent the generation of faulty
sequences, especially for genes with high GC content.
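
The melting temperature separation described above can be sketched as a simple check followed by a two-phase annealing schedule. This is a minimal sketch under stated assumptions: the function names are hypothetical, the 8°C threshold is the one quoted in this summary, and the 67°C and 49°C annealing temperatures with 20 cycles each are taken from the figure captions for the S100A4 experiments rather than being general defaults.

```python
from typing import List, Sequence


def has_topdown_gap(inner_tms: Sequence[float],
                    primer_tms: Sequence[float],
                    min_gap: float = 8.0) -> bool:
    """Return True if every inner oligonucleotide melts more than min_gap
    deg C above every outer primer (worst case: lowest inner Tm against
    highest primer Tm)."""
    return min(inner_tms) - max(primer_tms) > min_gap


def two_phase_annealing(assembly_temp: float,
                        amplification_temp: float,
                        assembly_cycles: int = 20,
                        amplification_cycles: int = 20) -> List[float]:
    """Annealing temperature per cycle: a stringent assembly phase followed
    by a lower-temperature amplification phase."""
    return ([assembly_temp] * assembly_cycles
            + [amplification_temp] * amplification_cycles)


if __name__ == "__main__":
    # Illustrative Tm values only; they are not data from the thesis.
    print(has_topdown_gap([68.0, 70.5, 69.2], [55.0, 56.5]))
    print(two_phase_annealing(67.0, 49.0))
```

Comparing the lowest inner Tm with the highest primer Tm is a conservative choice for the sketch; a design tool could equally compare average values.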
The Automatic TouchDown (ATD) one-step gene synthesis method is developed to further
improve the TopDown method. It enables the synthesis of long DNA of up to 1.5 kbp in a single
polymerase chain reaction (PCR) process. The method involves two key steps: (i) design of outer
primers with two melting temperatures, and (ii) utilization of DNA annealing kinetics to
selectively control the oligonucleotide assembly and full-length template amplification. With the
help of a novel real-time PCR approach to monitor the gene assembly process, the capability of
the ATD method has been demonstrated in the design and synthesis of human protein kinase B-2
(PKB2) (1446 bp) and the promoter of human calcium-binding protein A4 (S100A4) (752 bp)
with oligonucleotide concentrations as low as 1 nM.
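
The ATD primer design conditions listed in the caption of Figure 5.1 (Tp2 ≥ 72°C and Tmo - Tp1 ≥ 5°C) can be written as a small check. This is only a sketch: the class name `AtdDesign` and its method are hypothetical, and treating Tmo as a single representative value for the inner oligonucleotides is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtdDesign:
    tmo: float  # melting temperature of the inner oligonucleotides (deg C)
    tp1: float  # first melting temperature of the outer primers (deg C)
    tp2: float  # second melting temperature of the outer primers (deg C)

    def satisfies_conditions(self) -> bool:
        """Check Tp2 >= 72 and Tmo - Tp1 >= 5, as given in the Figure 5.1 caption."""
        return self.tp2 >= 72.0 and (self.tmo - self.tp1) >= 5.0


if __name__ == "__main__":
    # Illustrative values only; they are not data from the thesis.
    print(AtdDesign(tmo=70.0, tp1=64.0, tp2=73.0).satisfies_conditions())
```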
The integrated two-step gene synthesis device is established based on the developed
protocols. It is capable of performing two-step gene synthesis to assemble a pool of
oligonucleotides into genes with the desired coding sequence. The device comprises two
polymerase chain reaction (PCR) modules, temperature-controlled hydrogel valves, an
electromagnetic micromixer, a shuttle micromixer, volume meters, and a magnetic-bead-based
solid-phase PCR purification module, fabricated using a fast prototyping method without a
lithography process. The fabricated device is combined with a miniaturized thermal cycler to
perform gene synthesis. This device has been demonstrated to successfully synthesize a green
fluorescent protein fragment (GFPuv) (760 bp), with a synthesis yield and error rate comparable
to those of experiments conducted in PCR tubes in a commercial thermal cycler. To our
knowledge, this is the first microfluidic device demonstrating integrated two-step gene synthesis.




LIST OF FIGURES
Figure 1.1: Generic gene synthesis process. It takes about two weeks to construct and
deliver an error-free DNA.
Figure 2.1: Flowchart of bioinformatics software. Both protein and DNA sequences are
accepted as input. The program generates optimized partition schemes of the input
sequences according to user requirements.
Figure 2.2: Process steps of gene synthesis. Oligonucleotides are synthesized as
building blocks for polymerase cycling assembly or ligase chain reaction.
Synthesized DNA containing mismatches is filtered out via enzymatic error filtering.
Figure 2.3: LCR-based gene synthesis. (a) Oligos are phosphorylated by converting their
5’ ends from hydroxyl groups to phosphate groups using a kinase; (b) oligos
are gradually linked together to form template DNA using a thermostable ligase.
Figure 2.4: Operation principle of two-step overlapping polymerase cycling assembly.
Different pools of oligos with partially overlapping sequences are first
assembled into long DNA blocks. The outer primers are then added to
amplify the assembled full-length DNA.
Figure 2.5: (a) Successive extension polymerase cycling assembly method. DNA is
elongated successively from oligos R5 and F5. (b) Thermodynamically
balanced inside-out polymerase cycling assembly method. DNA
construction starts from the inside oligos F1 and R1 and is gradually extended
using the outside oligos.
Figure 2.6: Schematic illustrations of non-specific and specific DNA purifications
using (a) ChargeSwitch magnetic beads, (b) streptavidin magnetic beads, and (c)
oligo (dT)25 magnetic beads.
Figure 2.7: Principle steps of MutS error filtering. After re-annealing of the assembled DNA,
mismatched heteroduplex DNA is captured by the MutS enzyme and
separated from the DNA with the correct sequence by gel electrophoresis [60].
Figure 2.8: Working principle of enzymatic cleavage proteins. Endonucleases such as
T4E7 and T7E1 recognize and bind to the mutation site of mismatched
DNA and cleave the DNA into two segments.
Figure 3.1: Scheme of LCR or gapless PCR assembly. The input sequence is the serial
connection of the overlap regions of the oligonucleotides.
Figure 3.2: An overview of the oligonucleotide design scheme. The software first
divides the input sequence into approximately equal-temperature (Equi-Tm)
or equal-length (Equi-space) fragments using markers based on the
user-specified melting temperature. The positions of the markers are
iteratively shifted to globally minimize the deviation in melting
temperature among the fragments (Tm Equilibrate). Two adjacent
fragments are joined together to generate oligonucleotides for gapless
assembly.
Figure 3.3: Web interface for TmPrime. TmPrime is implemented as functional
modules, each module reflecting a different aspect of the oligonucleotide
design process, with interface elements organized in coherent groups.
Figure 3.4: Base composition plot of gene sequence GC content and melting
temperature plot of overlap regions of oligonucleotides partitioned using the
Equi-space approach. (a) PKB2 (1446 bp, G+C: 58.4%). (b) S100A4 (752
bp, G+C: 56%). The GC plot was obtained using Isochore. The highly
similar profiles of the GC and melting temperature plots clearly indicate the
effect of GC clusters on the Tm homogeneity of the oligonucleotides.
Figure 3.5: Agarose gel electrophoresis of assembled products. One-step synthesis of
GFPuv (760 bp) from TmPrime: (Lane 1) optimized and (Lane 2) fixed-
length control oligonucleotides. (Lane 3) One-step synthesis of PKB2
(1446 bp). Two-step synthesis of PKB2: (Lane 4) assembly and (Lane 5)
amplification. (Lane 6) One-step synthesis of S100A4 (752 bp). Two-step
synthesis of S100A4: (Lane 7) assembly and (Lane 8) amplification. The
annealing temperatures for the PCR process are as follows: GFPuv, 50°C;
PKB2, 61°C; S100A4, 58°C (assembly) and 49°C (amplification).
Figure 3.6: (a) Melting peak analyses of the assembled products for GFPuv from one-
step synthesis: ( ) optimized and (—) fixed-length control oligos. Melting
peak analyses of the assembled products for (b) PKB2 and (c) S100A4
from one-step and two-step syntheses; two replicates were performed for
each set of oligos. The corresponding agarose gel electrophoresis results of
the assembled products are shown in Figure 3.5. The measured Tm values
are 86.5°C for GFPuv, 91.5°C for PKB2, and 90.5°C for S100A4.
Figure 3.7: Agarose gel electrophoresis of LCR-assembled GFPuv with TmPrime-
optimized oligonucleotides. (a) LCR products (2, 4 and 8 hrs assembly)
before second PCR. (b) Second PCR after LCR (2, 4 and 8 hrs). Lane (L):
100 bp DNA ladder.
Figure 4.1: Schematic illustration of TopDown one-step gene synthesis combining
PCR assembly and amplification into a single stage with different
annealing temperatures designed for assembly and amplification. Inner
oligonucleotides and outer primers are designed with a melting temperature
difference > 15°C to minimize potential interference during PCR.
Figure 4.2: Agarose gel (1.5%) electrophoresis results of one-step (30 cycles),
TopDown (TD) one-step (40 cycles), and two-step (PCA: 30 cycles; PCR:
30 cycles) gene syntheses. The TD one-step process is conducted with an
annealing temperature of 67°C for the first 20 cycles followed by another
20 cycles with an annealing temperature of 49°C. The concentrations of
oligonucleotides and outer primers are 10 nM and 400 nM, respectively.
Figure 4.3: Continuous fluorescence monitoring of real-time gene synthesis with 1×
LCGreen I. The first 20 cycles are conducted with an annealing temperature of
67°C, followed by another 20 cycles with an annealing temperature of 49°C.
The concentrations of oligonucleotides and outer primers are 10 nM and
400 nM, respectively.
Figure 4.4: Concentration effects of SYBR Green I and LCGreen I for TD one-step
real-time gene synthesis of S100A4. (a) 0.25× to 5× SYBR Green I. The
fluorescence intensity of 1× LCGreen I is also included in this plot for
comparison. The fluorescence curves of SYBR Green I are insensitive to
the number of PCR cycles, and fail to indicate the DNA length extension
during gene synthesis. (b) 0.25× to 5× LCGreen I. The annealing
temperatures for assembly and amplification are 58°C and 49°C,
respectively. The concentrations of oligonucleotide and outer primer are 64
nM and 400 nM, respectively.
Figure 4.5: The MgSO4 concentration is critical for successful gene synthesis. (a)
Fluorescence of 1× LCGreen I as a function of PCR cycle number for
various concentrations of MgSO4: 1.5 mM (◊), 2.5 mM (□), 3.0 mM (∆),
3.5 mM (×), 4.0 mM (●), and 5.0 mM (○). (b) The corresponding agarose
gel electrophoresis results. The TD one-step gene synthesis is conducted
with annealing temperatures of 58°C and 49°C for assembly and
amplification, respectively, 1 mM each of dNTP, 10 nM of
oligonucleotides, and 400 nM of forward and reverse primers. Gene
synthesis with 4 mM of MgSO4 provides the best yield of full-length
product.
Figure 4.6: The oligonucleotide concentration is critical for successful gene
synthesis. S100A4 (752 bp) is synthesized with various oligonucleotide
concentrations ranging from 5 nM to 80 nM, and annealing temperatures of
67°C for the first 20 cycles and 49°C for the next 20 cycles. (a)
Fluorescence as a function of PCR cycle number for oligonucleotide
concentrations of 5 nM (◊), 7 nM (□), 10 nM (∆), 13 nM (+), 17 nM (×),
20 nM (○), 40 nM (●), 64 nM (▲), and 80 nM (♦). The slopes of
fluorescence increment in the early cycles and at cycle #21 indicate the
efficiencies of the assembly and amplification processes. (b) The
corresponding agarose gel electrophoresis results.
Figure 4.7: S100A4 (752 bp) is successfully synthesized with various primer
concentrations ranging from 60 nM to 1 µM, as indicated by the sharp,
narrow gel band of the desired length. (a) Fluorescence as a function of
PCR cycle number for outer primer concentrations of 60 nM (◊), 120 nM
(□), 200 nM (∆), 300 nM (×), 400 nM (+), and 1 µM (○). The inset shows
the fluorescence signal of the first 20 cycles. (b) The corresponding
agarose gel electrophoresis results.
Figure 4.8: S100A4 is synthesized with various assembly cycles (6-20 cycles),
followed by another 20 cycles for amplification. Agarose gel (1.5%)
electrophoresis results indicate that full-length assembly is achieved within 11
cycles.
Figure 4.9: S100A4 (752 bp) synthesized with various assembly annealing
temperatures ranging from 58°C to 70°C for the first 20 cycles, followed
by an annealing temperature of 49°C for the next 20 cycles. (a)
Fluorescence as a function of PCR cycle number for annealing
temperatures of 58°C (◊), 60°C (□), 62°C (∆), 65°C (×), 67°C (+), and
70°C (○). The inset shows the middle 15 cycles (#13–27). (b) The
corresponding agarose gel electrophoresis results. Higher synthesis yield
was obtained with a stringent assembly annealing temperature (> 67°C).
Figure 5.1: Schematic illustration of Automatic TouchDown (ATD) one-step gene
synthesis combining PCR assembly and amplification into a single stage.
The melting temperatures of inner oligonucleotides (Tmo) and outer
primers (Tp1 and Tp2) are designed with the conditions of Tp2 ≥ 72°C
and Tmo - Tp1 ≥ 5°C to minimize potential assembly-amplification
interference and maximize the full-length amplification during PCR.
Figure 5.2: Effect of hybridization reaction time. Top: Agarose gel results of (a)
S100A4-1, (b) S100A4-2, and (c) PKB2 synthesized with: (1) 10-s
annealing (70°C) plus 10-s extension (72°C), and (2) 30-s annealing (70°C)
plus 90-s extension (72°C). Bottom: The corresponding fluorescence curves
for S100A4-1 (□: 20 s, ■: 120 s), S100A4-2 (Δ: 20 s, ▲: 120 s), and
PKB2 (○: 20 s, ●: 120 s). The concentrations of oligonucleotides and outer
primers are 10 nM and 400 nM, respectively.
Figure 5.3: The synthesis yield is dependent on the extension time. S100A4-2 (752 bp)
is synthesized with various extension times from 30 s to 120 s at an
annealing temperature of 70°C (30 s) with oligonucleotide concentrations of
(a,c) 10 nM and (b,d) 1 nM. (a, b) Fluorescence for extension
times of 30 s (◊), 60 s (▲), 90 s (♦), and 120 s (□). (c, d) The corresponding
agarose gel electrophoresis results. The synthesis from 10 nM
oligonucleotides reaches the plateau within 30 cycles, while the reaction
from 1 nM oligonucleotides only enters the amplification phase after 30
cycles.
Figure 5.4: The effect of oligonucleotide concentration on successful gene
synthesis. S100A4-2 (752 bp) is synthesized with various oligonucleotide
concentrations ranging from 1 nM to 40 nM. All PCRs are conducted with
30-s annealing at 70°C and 90-s extension at 72°C. (a) Fluorescence as a
function of PCR cycle number for oligonucleotide concentrations of 1 nM
(□), 5 nM (∆), 10 nM (▲), 15 nM (○), 20 nM (●), and 40 nM (◊). The
change in the slopes of fluorescence increment indicates the emergence of
full-length template. (b) The corresponding agarose gel electrophoresis
results. The arrow indicates the undesired DNA with 2× the length of the full-
length template, generated from non-specific full-length amplification of
excess PCR.
Figure 5.5: (a,c) S100A4-2 (752 bp) and (b,d) PKB2 (1446 bp) synthesized with
various annealing temperatures ranging from 58°C to 70°C (30 s) and 90-s
extension at 72°C. (a,b) Fluorescence as a function of PCR cycle number
for annealing temperatures of 58°C (◊), 60°C (∆), 62°C (□), 65°C (♦),
67°C (○), and 70°C (▲). (c,d) The corresponding agarose gel
electrophoresis results. Higher synthesis yield is obtained with a stringent
assembly annealing temperature (70°C). The slope changes in fluorescence
intensity indicate the automatic switch feature in the assembly and
amplification processes.
Figure 5.6: Agarose gel electrophoresis results of conventional one-step and ATD one-
step (30-cycle) gene syntheses with dNTP concentrations of 4 mM and
0.8 mM for (a) S100A4-1 (752 bp), (b) S100A4-2 (752 bp) and (c) PKB2
(1446 bp). All PCRs are conducted with 30-s annealing at 70°C and 90-s
extension at 72°C. The concentrations of oligonucleotides and outer
primers are 10 nM and 400 nM, respectively.

Figure 5.7: Fluorescence curves of conventional one-step (▲,♦) and ATD one-step gene
syntheses (Δ,◊) with dNTP concentrations of 4 mM (♦,◊) and 0.8 mM
(▲,Δ) for (a) S100A4-1 (752 bp), (b) S100A4-2 (752 bp), and (c) PKB2
(1446 bp). All PCRs are conducted with 30-s annealing at 70°C and 90-s
extension at 72°C. The concentrations of oligonucleotides and outer
primers are 10 nM and 400 nM, respectively.
Figure 5.8: Agarose gel electrophoresis results of S100A4-1 (lanes 1 and 3) and
S100A4-2 (lanes 2 and 4) with oligonucleotide concentrations of 10 nM
and 1 nM, and PKB2 (lane 5) with 1 nM oligonucleotides. The arrow
indicates the full-length DNA. Syntheses are performed with 30 and 36
cycles, respectively, for 10 nM and 1 nM oligonucleotides, with 30-s
annealing at 70°C and 90-s extension at 72°C.
Figure 6.1: Schematic illustration of PCR-based gene synthesis. One-step synthesis
combines PCA and PCR amplification into a single stage. The two-step
synthesis is performed with separate stages for assembly and amplification.
Figure 6.2: (a) Fabrication process of microfluidic chip. (b) Fabrication process of
hydrogel valve. The PCR reactions and hydrogel valves are controlled by
two separate thermoelectric heaters (TE 1 and TE 2). The inset shows a
closed hydrogel valve. (c) Photograph of a two-step gene synthesis chip
with solid-phase PCR purification (65 mm × 50 mm).
Figure 6.3: (A) Device operation diagram with process time of each step. (B) Detailed
schematic diagrams of each step: (a) Oligonucleotides and PCR mixture
were loaded into the PCA chamber (highlighted in red) from A1. PCA was
then conducted. (b) PCA-assembled solution (pumped through B1) was
mixed with fresh PCR mixture containing outer primers (pumped through
A2). The mixed PCR precursor is illustrated in green. (c) Mixed PCR
precursor (green) was positioned in the PCR chamber, and the PCR
amplification was performed. (d) PCR-synthesized product (highlighted in
green) and ChargeSwitch reagent (illustrated in yellow with black dots)
were pumped and loaded into the beads chamber. After mixing and incubation,
the magnetic beads were captured by a magnet. (e) Magnetic beads were
washed with washing buffer pumped from A5. (f) Elution buffer was loaded
and mixed with the magnetic beads; after incubation, the magnet was applied to
fix the beads. Synthesis product was eluted into the elution buffer and
collected through A7 (highlighted in green).
Figure 6.4: (a) Photographs of micromixer. Colored dyes (blue and red) were well
mixed after being shuttled three times between two chambers. (b)
Schematic illustration of the experimental arrangement with a syringe
pump, electromagnetic mixer, thermoelectric heaters and data acquisition.
Figure 6.5: The thermal response of the in situ photopolymerized hydrogel valve. The
valve functions were highly repeatable. The insets show the transitions
of valve functions.
Figure 6.6: Thermal cycling profiles of the custom-built PCR thermal cycler. A
thermocouple mounted on the heater was used in the temperature feedback
control (heater temperature) for thermal cycling. The temperature
difference between the heater surface and the inside of the PCR chamber
(chamber temperature) was compensated using a LabVIEW program.
Figure 6.7: Agarose gel (1.5%) electrophoresis showing the synthesis yields with
oligonucleotide concentrations of 5–25 nM and outer primer
concentrations of 0.1–0.4 μM for the two-step process. Syntheses were
conducted using a commercial thermal cycler. (a) PCA results. (b) PCR
amplification results.
Figure 6.8: Agarose gel (1.5%) electrophoresis comparing the synthesis results
obtained with a commercial thermal cycler (machine) and the microfluidic
device. (a) One-step process (device: single-chamber chip) and (b) two-
step process (device: two-step chip) conducted with an oligonucleotide
concentration of 10 nM and a primer concentration of 0.4 µM.
Figure 6.9: The effect of elution temperature and incubation time on DNA extraction
conducted within the microfluidic device (■: 3 min) and a standard PCR tube (□:
3 min; ◊: 2 min).
Figure 7.1: Schematic illustration of chip-based error filter module. Error-enriched
PCR product is pumped through the large inlet of the device, and the
mismatched DNA is captured by the MutS proteins, which are immobilized
on the Ni2+ beads. The error-depleted DNA is collected at the small outlet of
the device.
Figure S1: Scheme of overlapping PCR gene synthesis.
Figure S2: Calculated annealing probability distribution of (a) S100A4-1 and (b)
S100A4-2 at oligonucleotide concentrations of 1 nM (dashed line) and 10 nM
(solid line). Plotted for oligonucleotides with minimum Tm (black line),
maximum Tm (gray line) and average Tm (blue line).
Figure S3: The melting temperature versus oligonucleotide concentration plot for
oligonucleotide sets of S100A4-1 (dashed line) and S100A4-2 (solid line).
Plotted for oligonucleotides with minimum Tm (black line), maximum Tm
(gray line) and average Tm (blue line). Both oligonucleotide sets contain
more than 30 different oligonucleotides. The slopes of the average Tm
versus the logarithmic oligonucleotide concentration were ~ 1.21 and 1.28
for S100A4-1 and S100A4-2, respectively.
LIST OF TABLES
Table 1.1: Previous work on gene synthesis
Table 1.2: Gene synthesis companies
Table 2.1: Comparisons of the oligonucleotide design features of TmPrime with other
gene synthesis programs
Table 2.2: Comparison of different methods of DNA assembly
Table 3.1: Data on oligonucleotides
Table 3.2: Comparisons of the oligonucleotide design performance of TmPrime with
other gene synthesis programs for S100A4, PKB2, GFPuv and the whole
genomes of poliovirus [1] (GenBank FJ517648; 7418 bp) and ΦX174
bacteriophage [3] (GenBank J02482; 5386 bp) with an oligonucleotide
concentration of 10 nM
Table 4.1: Data of oligonucleotide set
Table 4.2: PCR conditions for one-step, non-competition (NC) one-step and two-step
gene synthesis
Table 4.3: Some reported optimal gene synthesis conditions
Table 5.1: Data of oligonucleotide set
Table 5.2: Summary of primers for conventional one-step and ATD one-step gene
syntheses. All PCR assemblies are performed with an annealing
temperature of 70°C
Table 6.1: Errors and efficiencies in the synthesis of GFPuv using one-step and two-
step processes in the microfluidic device vs. standard PCR tube (machine)
Table A1.1: TmPrime-optimized oligonucleotide set designed for the E. coli codon-
optimized GFPuv [1]
Table A1.2: Fixed-length oligonucleotide set designed for the E. coli codon-optimized
GFPuv [1]
Table A1.3: Oligonucleotide set designed for E. coli codon-optimized PKB2 [2]
Table A1.4: Oligonucleotide set designed for S100A4
Table A2.1: Oligonucleotide set designed for S100A4
Table A3.1: Semi-optimized oligonucleotide set (S100A4-1) designed for S100A4
with an oligonucleotide concentration of 10 nM
Table A3.2: Optimized oligonucleotide set (S100A4-2) designed for S100A4 with an
oligonucleotide concentration of 10 nM
Table A3.3: Oligonucleotide set designed for PKB2 with an oligonucleotide
concentration of 10 nM
Table A3.4: Partial list of potential mishybridizations for S100A4 gene synthesis
predicted by TmPrime gene synthesis software (.a-
star.edu.sg). The oligonucleotides are alternately displayed in upper and
lower case for ease of finding the oligonucleotide boundaries. Both the
forward and reverse mishybridizations are reported, which have the same
number of matched bases, but may generate different mishybridization
formations during the assembly
Table A3.5: Partial list of potential mishybridizations for PKB2 gene synthesis
predicted by TmPrime gene synthesis software (.a-
star.edu.sg)
Table A4.1: Summary of melting temperatures of S100A4-1, S100A4-2 and PKB2
oligonucleotide sets at oligonucleotide concentrations of 10 nM and 1 nM
CHAPTER I
INTRODUCTION
For the last decade, molecular biologists at large have focused most of their resources and efforts
on decoding, sequencing and analyzing naturally occurring deoxyribonucleic acids (DNAs). It was
not until the beginning of the 21st century that attention shifted to synthetic biology, which
requires the artificial creation of non-natural genes, genomes, proteins, biological processes and
organisms. Gene synthesis, an area of molecular biology that draws on organic chemistry and
molecular biology procedures, is a highly efficient technology capable of creating full-length
genes, operons and even genomes de novo [1]. This technique, first demonstrated by Har Gobind
Khorana in 1979 [2], allows synthetic genes to be generated without a biological template and was
conceived as a means of gene acquisition. It also gives biologists the unique flexibility of
considering multiple gene design parameters in parallel. For example, codon optimization,
suppression of deleterious secondary DNA structures, and the generation of specific restriction
sites or motifs can all be taken into consideration simultaneously. Cello et al. [1] used this de
novo gene synthesis method to assemble a viral genome of 7.5 kbp in 2002. Likewise, Smith et al. [3]
and Kodumal et al. [4] demonstrated the assembly of a bacteriophage genome of 5.4 kbp in 2003 and a
gene cluster as large as 32 kb in 2004, respectively. The longest synthetic DNA reported to date is
the 582 kbp genome of Mycoplasma genitalium, assembled by Venter and co-workers [5] in 2008. These
remarkable achievements were the result of meticulous planning and long hours of laborious,
repetitive bench work that consumed large quantities of chemical reagents.
Indeed, gene synthesis has become an enabling technology for many fields of
recombinant and synthetic gene technology. For instance, synthetic genes can be used for protein
over-expression in heterologous systems [6-8], drug and vaccine development [9, 10], gene therapy,
and molecular or protein engineering [11, 12]. Gene synthesis technology is also widely used in the study
of ancestral gene construction as well as in the development of artificial gene networks and
synthetic genomes [13, 14].
The context and challenges of gene synthesis are complex, as they require simultaneous
attention to numerous interconnected parameters. The following sections aim to give readers a basic
understanding of gene synthesis, the shortcomings and limitations of various synthesis schemes, and
an appreciation of the complexity of current gene synthesis methods.

1.1 Overview of gene synthesis
In general, gene synthesis employs a “top-down” approach that involves a series of
highly complex processing steps, as shown in Figure 1.1. It comprises the sequential activities
of (i) pre-synthesis oligonucleotide design, (ii) oligonucleotide synthesis (oligonucleotide
synthesizer) and (iii) gene synthesis (gene synthesizer). To enhance the quality of the synthesized
genes, (iv) post-synthesis processes (such as gene purification and error filtering) may be
required to remove incorrect or unwanted genes from the final synthesized products.
As illustrated in Figure 1.1, the success of gene synthesis depends heavily on the
accuracy of the short oligonucleotide (single-stranded DNA) design and on the prediction of correct
synthesis conditions. This is a demanding task that requires considerable computational power
from bioinformatics software such as DNAWorks [14] and GeneDesign [15]. Such software takes as
input a text file of the target DNA sequence together with other critical synthesis parameters,
such as the oligonucleotide and outer-primer concentrations. Based on these user-supplied
parameters, the software partitions the desired gene sequence into the short oligonucleotide
sequences required by the oligonucleotide synthesizer. Some software also provides supplementary
information such as overlapping sites, the melting temperature uniformity of the partitioned
oligonucleotides, and possible mishybridization sites. It should be noted that the success of any
gene synthesis process hinges on the ability of the bioinformatics software to accurately predict
correct synthesis conditions.
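
To make the partitioning step concrete, the sketch below splits a target sequence into fixed-length
oligonucleotides that overlap their neighbours and alternate between the two strands, then
estimates a melting temperature for each overlap. It is a minimal illustration, not the algorithm
used by DNAWorks, GeneDesign or any other package: the function names, the default oligonucleotide
length and overlap, and the simple GC-content melting temperature formula are all assumptions
chosen for brevity.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def approx_tm(seq):
    """Crude GC-content estimate of melting temperature in deg C.

    Assumption: Tm = 64.9 + 41 * (G + C - 16.4) / N, a rule of thumb that
    ignores salt and oligonucleotide concentration.
    """
    gc = sum(1 for base in seq if base in "GC")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def partition(target, oligo_len=40, overlap=20):
    """Split target into overlapping oligonucleotides on alternating strands.

    Returns a list of (start, oligo) pairs, where start is the position of
    the oligonucleotide on the top strand. Even-numbered oligonucleotides
    are read from the top strand, odd-numbered ones from the bottom strand.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        piece = target[start:start + oligo_len]
        on_top = len(oligos) % 2 == 0
        oligos.append((start, piece if on_top else reverse_complement(piece)))
        if start + oligo_len >= len(target):
            break  # this oligonucleotide reaches the end of the target
        start += step
    return oligos


def overlap_tms(target, oligo_len=40, overlap=20):
    """Estimate the melting temperature of each overlap between neighbours."""
    starts = [start for start, _ in partition(target, oligo_len, overlap)]
    return [approx_tm(target[s:s + overlap]) for s in starts[1:]]
```

Comparing the spread of the values returned by overlap_tms gives a quick feel for how uneven the
annealing behaviour of a fixed-length design can be, which is the motivation for designs that
equalize melting temperature instead of length.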




Figure 1.1: Generic gene synthesis process. It takes about two weeks to construct and
deliver error-free DNA.
The computed short oligonucleotide sequences are fed into an oligonucleotide
synthesizer, where short fragments of single-stranded nucleic acid with defined sequences are
synthesized. This is a highly efficient and inexpensive technology for generating short
oligonucleotides of a desired sequence and length. State-of-the-art oligonucleotide
synthesizers can automatically synthesize oligonucleotides of up to about 200 bases.
However, to reduce the error rate of the final gene product caused by errors introduced during
oligonucleotide synthesis, the partitioned oligonucleotides are commonly
kept to 15 to 40 bases for gene synthesis applications. To date, oligonucleotide
synthesis has been commonly used to produce antisense oligonucleotides, small interfering RNA,
primers for gene synthesis, and probes for detecting complementary DNA (DNA microarray
technology).
Next, the synthesized short oligonucleotides (~40 to 90 bases) are fed into a gene
synthesizer for gene assembly, using either the ligase chain reaction (LCR) or polymerase cycling
assembly (PCA). To increase the amount of the target gene, most protocols include a DNA
amplification step, the polymerase chain reaction (PCR). PCR
involves the denaturation and in vitro enzymatic replication of the target DNA through the combined
action of primers (short oligonucleotides complementary to the 3’ ends of both strands of the
target DNA) and DNA polymerase over iterative thermal cycles. During PCR, the DNA molecules are
replicated by the DNA polymerase, doubling the number of DNA molecules. Each of these molecules is
then replicated in a second cycle, resulting in four times the original number of molecules. Each
of these molecules is again replicated in a third cycle, and so on. This technique allows a
single piece of DNA to be amplified exponentially, creating millions of copies of the original
DNA. PCR has been extensively modified to perform a wide array of genetic manipulations and
diagnostic tests, among many other uses.
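
The doubling described above corresponds to N = N0 × 2^n copies after n cycles. The short sketch
below expresses this, with an optional per-cycle efficiency term that is an assumption added here
for illustration (an efficiency of 1.0 reproduces ideal doubling); the function names are likewise
illustrative.

```python
import math


def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles; efficiency=1.0 means perfect doubling."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles that reaches target_copies."""
    ratio = target_copies / initial_copies
    return max(0, math.ceil(math.log(ratio) / math.log(1.0 + efficiency)))
```

Under ideal doubling, a single template passes one million copies after 20 cycles, which is the
sense in which PCR creates millions of copies of the original DNA.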
Before the synthesized genes are sent to users for further application, they may be subjected to a
post-synthesis treatment to screen out unwanted genes from the final product pool. The target gene
can be separated from impurities (such as truncated genes) by extracting DNA of the desired length
in different ways, such as gel electrophoresis, magnetic charge-switch beads or enzymatic
digestion. In general, most gene synthesis techniques have an error rate of about 1 to 5 bases per
1000 bp because errors accumulate throughout the synthesis processes. In most cases, these errors
result from poor-quality short oligonucleotides or from errors introduced during the gene assembly
and amplification stages. They can be removed by error filtering, which uses enzymes to recognize
and capture or digest mismatched DNA [16, 17]. This is a complicated but very important process
that determines the quality of the synthesized gene. A well-designed error filtering scheme
effectively increases the overall yield of gene synthesis, as incorrect sequences (mismatches,
mutations, insertions and deletions) are removed from the product pool. Error filtering in gene
synthesis is still in its infancy and requires extensive development effort. The error-filtered
genes are then ready to be used for cell-free protein synthesis [18], or to be inserted into
vectors and cloned for sequencing before being used in future applications.
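
To see why an error rate of 1 to 5 bases per 1000 bp matters, the fraction of error-free molecules
can be estimated for a gene of a given length. The sketch below assumes that errors occur
independently and uniformly at each base, which is a simplification introduced here and not a
model taken from the text; the function names are illustrative.

```python
def error_free_fraction(gene_length_bp, errors_per_kb):
    """Probability that a gene carries no error, assuming independent errors."""
    per_base_error = errors_per_kb / 1000.0
    return (1.0 - per_base_error) ** gene_length_bp


def expected_clones_to_screen(gene_length_bp, errors_per_kb):
    """Average number of clones sequenced before one correct clone is found."""
    return 1.0 / error_free_fraction(gene_length_bp, errors_per_kb)
```

Under these assumptions, about 37% of 1 kb molecules are error-free at 1 error per kb, but fewer
than 1% at 5 errors per kb, which illustrates why the quality of the oligonucleotides and the
error filtering step have such a strong effect on the usable yield.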
1.2 Challenges in Gene Synthesis
Table 1.1 shows the results obtained by various research groups in synthesizing DNA of
different lengths (from 139 bp to 5.38 kbp) using various assembly and amplification protocols. It
is interesting to note that there is no direct correlation between the number and size (in mers)
of the oligonucleotides and the length of the target DNA. The parameters stated in the table were
obtained after many iterative experiments. Indeed, successful gene synthesis is often
resource-intensive, as there is no standardized protocol for synthesizing genes of various lengths
and sequence complexity.
Table 1.1: Previous work on gene synthesis
Name                          Target length   # of oligomers     Method      Error rate   Reference
Mycoplasma genitalium         -               -                  PCR         -            Gibson, 08 [5]
φX174 bacteriophage           5,386 bp        96 of 42 mers      LCR & PCR   -            Smith, 03 [3]
-                             139–1042 bp     -                  PCR         1.8 / kb     Hoover, 02 [14]
-                             1476 bp         64 of 40 mers      LCR & PCR   -            Chalmers, 01 [19]
Phenylalanine ammonia-lyase   2.2 kbp         108 of 40 mers     PCR         -            Baedeker, 99 [6]
Plasmodium falciparum         2.1 kbp         104 of 40 mers     PCR         3.5 / kb     Martinez, 99 [20]
Ornithine transcarbamylase    1044 bp         18 of 70-80 mers   PCR         2.7 / kb     Wheeler, 96 [21]
-                             1.1 kb          56 of 40 mers      PCR         -            Stemmer, 95 [22]
Bovine prochymosin            1100 bp         13 of 87 mers      LCR         1.9 / kb     Wosnick, 87 [23]

Apart from research groups, several companies provide gene synthesis services. As shown in
Table 1.2, the average price of synthesizing one base pair of DNA is around USD 1.20, which means
that it costs about USD 12,000 to synthesize 10 kb of DNA. The delivery time ranges from 2 to 6
weeks or even longer, depending on the length and difficulty of the DNA to be synthesized.
The advancement of DNA synthesis technologies is greatly impeded by high cost
(~USD 0.85 to USD 1.20 per bp) and long turnaround time, which are mainly attributable to the
costs of manpower and laboratory equipment, the large amounts of expensive chemicals used, and
the complex and time-consuming synthesis processes. Gene synthesis also requires high accuracy:
even a single error in the sequence of a synthetic DNA may lead to the total failure of the
downstream applications [6-14]. Hence, the main challenge facing current gene synthesis
technologies is to produce low-cost, high-fidelity synthetic genes with a fast turnaround.
Table 1.2: Gene synthesis companies
Name               Price (USD/bp)   Delivery (weeks)
DNA 2.0            -                2-4
Epoch Biolabs      > $0.85          2-6
GenScript          $1.25            2-6
Modular Genetics   $2.00            > 2
BlueSky            $1.5             3-5
Biomatik           $1.15            > 2
IDT                $0.95–1.75       2-6
Blue Heron         $1.25–1.60       > 2
RealGene           $1.45            > 2

1.3 Motivation
To achieve our goal of providing low-cost, high-fidelity synthetic genes with a fast turnaround, we
conducted a systematic study of the limitations of current gene synthesis approaches and then
developed our own solutions. These include a new bioinformatics program for gene synthesis, an
investigation of the kinetics and mechanisms of PCR-based gene synthesis, a novel gene synthesis
approach at ultra-low oligonucleotide concentration, and finally lab-chip devices that integrate
the tedious gene synthesis process into a chip.
Several bioinformatics programs have been developed, such as DNAWorks [14], Gene2Oligo [24],
GEMS [25] and GeneDesign [15]. These programs aim to partition a gene sequence into short
oligonucleotides with uniform melting temperatures, and provide information on potential
mishybridization sites. Some of them also have useful features to divide long DNA into
segments, as well as codon optimization for heterologous protein expression. However, no
gene synthesis program provides all of these features for long DNA (> 5 kbp) or
multiplex gene synthesis. This prompted us to develop our own bioinformatics software, TmPrime,
which provides all of the desired functions.
Several PCR-based gene synthesis methods have been reported, including one-step and two-step
overlapping PCR [8, 22], successive PCR [8] and thermodynamically balanced inside-out (TBIO)
PCR-based gene synthesis [26]. These approaches were developed to optimize the PCR process for
long DNA synthesis, or to enhance the efficiency and accuracy of the synthesis process. Their
performance has been demonstrated on only a limited number of genes (< 5), based on end-point gel
electrophoresis results, and there is no model that can predict the course of gene synthesis. In
this thesis, we establish an accurate gene synthesis model based on a novel TopDown gene
synthesis method, with the help of real-time fluorescence monitoring. This is the first time that
gene synthesis has been combined with real-time fluorescence monitoring to reveal the kinetics of
the gene synthesis process. The model provides deeper insight into the PCR-based synthesis process
and helps identify optimal reaction conditions.
The production of synthetic genes is to a large extent hampered by its high cost
(~USD 0.85 per base pair), with the major expenditure being oligonucleotides (~USD 0.1 per base),
which limits its application in large-scale, systematic studies [27, 28]. This could potentially be
solved by synthesizing genes with oligonucleotides from DNA microarrays, which would significantly
reduce the cost. Current microchips have very small surface areas, and hence only a small
amount of oligonucleotides (0.1 pmol/mm²) can be produced [29]. Thus, the resulting concentration
of eluted oligonucleotides (<1 nM; 100 mm × 100 mm spot size and 1 ml PCR volume) might be
insufficient for effective hybridization. Moreover, the eluted solution contains thousands of
different oligonucleotides, which increases the complexity of the DNA assembly process and the
possibility of mishybridization. This prompted us to study the gene synthesis process at an
ultra-low oligonucleotide concentration (1 nM), with the oligonucleotide quantity matched to that
of a DNA microarray. A novel approach termed Automatic TouchDown gene synthesis was developed, which
combines gene assembly and gene amplification into a single PCR step. To our knowledge,
this is the first reported method able to synthesize relatively long DNA from an ultra-low
oligonucleotide concentration of 1 nM.
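
The estimate of the eluted oligonucleotide concentration in this paragraph can be checked with a
short calculation. The sketch below assumes a spot of 100 µm × 100 µm and an elution volume of
1 µl; these micro-scale units are an assumption made for the arithmetic, chosen because they
reproduce a value near 1 nM from the stated yield of 0.1 pmol/mm². The function and parameter names
are illustrative.

```python
def eluted_concentration_nM(yield_pmol_per_mm2, spot_side_um, volume_ul):
    """Concentration in nM of oligonucleotide eluted from one square spot.

    Assumed units: yield in pmol per mm^2, spot side in micrometres,
    elution volume in microlitres.
    """
    side_mm = spot_side_um / 1000.0
    amount_pmol = yield_pmol_per_mm2 * side_mm ** 2
    # 1 pmol per microlitre equals 1 micromolar, i.e. 1000 nM.
    return amount_pmol / volume_ul * 1000.0
```

With these assumed units, eluted_concentration_nM(0.1, 100, 1) comes out at about 1 nM, and any
larger elution volume lowers it further, which is consistent with the sub-nanomolar regime that
motivates synthesis at ultra-low oligonucleotide concentration.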
The cost of manpower and laboratory equipment makes up a large portion of the total
expenditure of the synthesis process. A possible way to reduce these costs is to
integrate the tedious gene synthesis process into a lab-chip device that provides an automatic
microsystem for gene synthesis. Numerous integrated microchip-based PCR devices have been
constructed using lab-on-a-chip technologies [30-33]. However, most of the reported micro-PCR
devices are designed for genetic analysis, not for gene synthesis. So far, only one work,
reported by Kong et al. [34], has demonstrated a polydimethylsiloxane (PDMS) device for one-step
PCR gene synthesis. However, this device did not include other steps of the gene synthesis
process, such as DNA purification and error filtering. In this PhD work, we have established a
two-step gene synthesis microfluidic platform that integrates polymerase cycling assembly,
amplification and a DNA extraction module into a single chip, using a fast prototyping method
without lithography. Microfluidic syntheses were successfully performed with a low
oligonucleotide concentration of 10 nM and a primer concentration of 0.4 µM. The synthesized
products were verified by DNA sequencing, and the error rate was comparable to that of control
experiments conducted in PCR tubes with a commercial thermal cycler. This device would be
useful for constructing a more comprehensive system for fully automated gene synthesis.
1.4 Objectives of this PhD thesis
The primary objective of this thesis is to address the problem of synthesizing high-quality genes
in a cost-effective manner with a short turnaround time. To achieve this, a “parallel-topdown”
approach to defining the key synthesis parameters is essential. This requires careful
consideration of the design of the short oligonucleotides, the processing conditions of the gene
synthesis process, and the implementation of error-filtering schemes in post-synthesis processing.
In addition, to increase the efficiency of the gene synthesis process, a fully automated
lab-on-chip (LOC) based gene synthesizer will be constructed to validate the effectiveness of
LOC-based technology in reducing the synthesis time. This is very exciting as, to date, no
one has demonstrated the whole gene synthesis process in an integrated microfluidic chip. It should
be noted that, in this study, the synthesis of the short oligonucleotides was outsourced; hence, its
contribution to the error rate of the final synthesized gene could not be studied
quantitatively.
In addressing the issues of gene synthesis discussed above, the following tasks have been
completed:

Constructed new bioinformatics software that can accurately partition a target DNA (text file) into
short oligonucleotides. The software is capable of suggesting good synthesis conditions based on
user-defined parameters. It is able to generate short oligonucleotide sequences with highly uniform
melting temperatures while suppressing the formation of deleterious secondary structures. It also
includes codon optimization and can advise the user on the generation of specific restriction
sites to prevent mishybridization during the gene assembly process.
Designed and implemented a gene synthesis model that allows systematic analysis
and study of the kinetics of gene synthesis. This is critical given the current lack of kinetic
information on gene synthesis processes. Vital information can be gathered and fed back into the
upstream and downstream processes to optimize gene synthesis performance. The developed protocols
are universal for synthesizing genes of different lengths and complexity at ultra-low
oligonucleotide concentrations, and make minimal, if any, contribution to the overall error rate
of the synthesized gene.
Designed and constructed an integrated gene synthesis system on a microfluidic platform (i.e., a
gene synthesizer) using lab-on-a-chip technology. This was an extremely challenging task, as
gene synthesis is demanding even with the current level of technology. To ensure the success of
on-chip gene synthesis, new synthesis protocols were developed, as the synthesis kinetics on a
microfluidic platform differ greatly from those on the bench. In addition, a new microfluidic
system was developed to facilitate the precise metering, mixing, pumping, isolating, positioning
and transporting of fluids in the integrated chip. The device material was also carefully evaluated
to prevent contamination, which would cause gene synthesis to fail.
