The nucleotide sequence at the 5´ end of a gene can be defined as the promoter-associated 5´ untranslated region (UTR) together with the initial coding sequence. Because this region has been implicated in the control of translation, messenger RNA (mRNA) stability and even transcription, it can be regarded as one of the central control points in gene expression. Both the 5´-UTR and the coding sequence have often been included in optimization strategies aimed at stimulating recombinant protein production in E. coli, and numerous reports describe sequence-dependent structural features that can positively influence the overall expression process. Nevertheless, the actual mechanisms by which gene expression is regulated at the 5´ end remain obscure. The work reported in this thesis involved various types of analyses of the functionality of the 5´ end, using mutations as a major tool. It can be seen mainly as a detailed empirical analysis of the relation between specific nucleotide sequences at the 5´ end of genes and the final outcome at the protein production level. The results also indicate that optimizations based on empirical laboratory protocols are currently unlikely to be exceeded by predictions based on bioinformatics software.
Sequence mutagenesis of elements in the XylS/Pm positive regulator/promoter system, coupled to high-throughput screening, had previously proven to be a powerful method for increasing the expression of recombinant genes from this expression cassette. At the beginning of this thesis work, the effect of introducing random mutations was investigated in the DNA sequence of the Pm promoter-associated 5´-UTR and of two 5´ fusion partners, whose sequences correspond either to a consensus translocation signal peptide or to the first 23 codons of the well-expressed celB gene (encoding a cytoplasmic phosphoglucomutase). The core of the experimental work was the construction of large combinatorial libraries of the different DNA sequences and subsequent selection for improved expression of a reporter gene (either an ampicillin or an apramycin resistance gene), which was indicated by an increase in antibiotic tolerance of the corresponding E. coli host cells. A shared result of the three individual studies was a collection of optimized sequences that generally improved the protein production properties of both reporter genes and industrially relevant heterologous genes.
In addition to random mutagenesis, synonymous mutations were also introduced in the DNA sequence of the consensus signal peptide (CSP), and the resulting expression effects were evaluated. DNA changes that did not alter the amino acid sequence stimulated expression of the bla reporter (ampicillin resistance) less than complete sequence randomization did. Similar results were obtained when the synonymous codon usage of the first 9 codons of the medically important ifn-α2b gene was optimized by a bioinformatic method, followed by experimental determination of the expression levels of several rationally selected synonymous ifn-α2b variants. These results indicated that optimizing the codon usage of the 5´ coding sequence has limited effects, probably because of intrinsic characteristics of the sequence. However, the use of optimized 5´ fusion partners or 5´-UTR variants can often overcome such limitations.
Besides evaluating expression at the protein level, the work also addressed how changes at the 5´ end of a gene influence expression at the level of transcript accumulation and mRNA stability. For that purpose, a non-invasive method for assessing recombinant mRNA stability in bacteria was developed. The procedure was based on the removal of diffusible transcriptional inducers followed by qRT-PCR determination of mRNA levels at consecutive time points. Among the principal findings was that a 5´ fusion partner (specifically the translocation signals pelB and ompA, together with the celB-based 5´ fusion) contributes to the stimulation of recombinant gene expression by enhancing the stability of the corresponding fusion mRNA. The stimulation of expression caused by specific mutations in the 5´-UTR and the adjacent coding sequence (synonymous changes), on the other hand, surprisingly appeared to result from an improved rate of mRNA synthesis. Three selected promoter systems (Pm, Ptac and the T7-based system) were used in these studies, and part of the work also evaluated how fast each system responds to the addition and removal of its inducer. The expression systems were found to affect both transcript accumulation and decay in a specific way that correlated with the type of transcriptional regulation each system is subject to.
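As an illustration of how time-course data of this kind are commonly turned into a stability estimate, the sketch below fits a first-order decay model to relative mRNA levels measured after inducer removal and derives a half-life. The first-order model, the function name `estimate_half_life` and the example numbers are assumptions made for illustration; they are not taken from the thesis and do not describe its actual analysis.

```python
import math


def estimate_half_life(times_min, rel_levels):
    """Fit ln(level) = ln(N0) - k * t by ordinary least squares.

    Assumes first-order decay (an illustrative assumption).
    times_min  : sampling times after inducer removal, in minutes
    rel_levels : relative transcript levels (e.g. from qRT-PCR), all > 0
    Returns (k, half_life) with k in 1/min and half_life in minutes.
    """
    if len(times_min) != len(rel_levels) or len(times_min) < 2:
        raise ValueError("need at least two paired time/level values")
    if any(v <= 0 for v in rel_levels):
        raise ValueError("levels must be positive to take logarithms")

    logs = [math.log(v) for v in rel_levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n

    # Slope of the least-squares line through (t, ln(level)).
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx

    if k <= 0:
        raise ValueError("data show no decay; half-life is undefined")
    return k, math.log(2) / k


# Hypothetical data: the level halves every 2 minutes.
k, half_life = estimate_half_life([0, 2, 4, 6], [1.0, 0.5, 0.25, 0.125])
print(f"decay constant: {k:.3f} per min, half-life: {half_life:.2f} min")
```

Fitting on the log scale keeps the sketch dependency-free; with noisy measurements, replicate time courses and a check that decay is actually log-linear would be needed before trusting the estimate.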
Finally, a study was carried out comparing five bacterial expression systems (XylS/Pm, XylS/Pm ML1-17 (a Pm variant), the bacteriophage T7 RNA polymerase/promoter system, LacI/Ptrc and AraC/PBAD) with respect to their capacity to produce five different recombinant proteins. The comparison revealed many features specific to the expression system and to the model gene, and showed that none of the systems was superior in all evaluated aspects, which included adaptability of the system, maximum protein yield, basal expression in the absence of inducer, use of cellular resources and homogeneity of expression. However, the XylS/Pm system appeared to be a good starting point for optimizing various kinds of protein production processes, particularly because of its large associated collection of optimized genetic elements (such as sequence variants of the Pm promoter, the XylS regulator, the 5´-UTR and various translocation signals) and the possibility of simple genetic adjustments that can lead to both higher and lower expression levels.
NTNU, Trondheim, 2012. 236 p.
2012-11-16, PFI auditorium, Høgskoleringen 6B, NTNU, Trondheim, 13:15 (English)
Carpousis, Agamemnon, Director of Research
Winther-Larsen, Hanne Cecilie, Associate Professor