Proteomics


 

Proteomics and Spinal Fluid

 

Scientists and doctors have a far better understanding of the proteins in healthy spinal fluid, thanks to a U.S.-Swedish team that identified 2,630 proteins in the clear fluid that protects the brain and spinal cord. This discovery nearly triples the number of proteins known to exist in spinal fluid. Another striking finding was that slightly more than half of the proteins were not found in blood.

The team was led by Richard D. Smith, Ph.D., of Pacific Northwest National Laboratory, and Steven E. Schutzer, MD, of the University of Medicine and Dentistry of New Jersey—New Jersey Medical School.

In conducting this research, the team used integrated resources at EMSL that included a custom-built automated nanocapillary liquid chromatography system coupled on-line to one of two mass spectrometers, modified in-house with an electrodynamic ion funnel.

 

 

Proteome

 

The proteome is the entire set of proteins expressed by a genome, cell, tissue or organism. More specifically, it is the set of proteins expressed in a given cell type or organism at a given time under defined conditions. The term is a portmanteau of protein and genome.

The term has been applied to several different types of biological systems. A cellular proteome is the collection of proteins found in a particular cell type under a particular set of environmental conditions such as exposure to hormone stimulation. It can also be useful to consider an organism’s complete proteome, which can be conceptualized as the complete set of proteins from all of the various cellular proteomes. This is very roughly the protein equivalent of the genome. The term “proteome” has also been used to refer to the collection of proteins in certain sub-cellular biological systems. For example, all of the proteins in a virus can be called a viral proteome.

The proteome is larger than the genome, especially in eukaryotes, in the sense that there are more proteins than genes. This is due to alternative splicing of genes and post-translational modifications like glycosylation or phosphorylation.

Moreover the proteome has at least two levels of complexity lacking in the genome. While the genome is defined by the sequence of nucleotides, the proteome cannot be limited to the sum of the sequences of the proteins present. Knowledge of the proteome requires knowledge of (1) the structure of the proteins in the proteome and (2) the functional interaction between the proteins.

Proteomics, the study of the proteome, has largely been practiced through the separation of proteins by two dimensional gel electrophoresis. In the first dimension, the proteins are separated by isoelectric focusing, which resolves proteins on the basis of charge. In the second dimension, proteins are separated by molecular weight using SDS-PAGE. The gel is dyed with Coomassie Brilliant Blue or silver to visualize the proteins. Spots on the gel are proteins that have migrated to specific locations.

The mass spectrometer has augmented proteomics. Peptide mass fingerprinting identifies a protein by cleaving it into short peptides and then deduces the protein’s identity by matching the observed peptide masses against a sequence database. Tandem mass spectrometry, on the other hand, can get sequence information from individual peptides by isolating them, colliding them with a non-reactive gas, and then cataloguing the fragment ions produced.
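Peptide mass fingerprinting as described above can be sketched in a few lines of Python. This is a minimal illustration, not a production search engine: the two "database" sequences are invented, the function names are hypothetical, and the sketch ignores missed cleavages, modifications and charge states. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 places).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cut after K or R unless the next residue is P (the usual trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_count(observed, sequence, tol=0.02):
    """Number of observed masses that fall within tol Da of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def identify(observed, database, tol=0.02):
    """Return the database entry whose in-silico digest explains the most masses."""
    return max(database, key=lambda name: match_count(observed, database[name], tol))


# Hypothetical two-entry "database"; the sequences are invented for illustration.
DATABASE = {"toy_A": "GASKLLVTRMFW", "toy_B": "PEPTIDEKAAAR"}

# Simulate a measurement by digesting toy_A and adding a small mass error.
observed = [peptide_mass(p) + 0.005 for p in tryptic_digest(DATABASE["toy_A"])]
print(identify(observed, DATABASE))  # prints toy_A
```

Real search tools score matches statistically rather than by a simple count, but the principle of comparing observed masses with an in-silico digest is the same.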

 

 

 

Proteomics – What is Proteomics?

 

PDB structure 2p69, one of the protein structures solved by the New York SGX Research Center for Structural Genomics, a large-scale PSI center. This human phosphatase is involved in vitamin B6 metabolism.

 

 

 

Proteomics is the large-scale study of proteins, particularly their structures and functions. Proteins are vital parts of living organisms, as they are the main components of the physiological metabolic pathways of cells.

The term “proteomics” was first coined in 1997 to make an analogy with genomics, the study of the genes. The word “proteome” is a blend of “protein” and “genome”, and was coined by Marc Wilkins in 1994 while working on the concept as a PhD student.

The proteome is the entire complement of proteins produced by an organism or system. It is now known that mRNA is not always translated into protein, and the amount of protein produced for a given amount of mRNA depends on the gene it is transcribed from and on the current physiological state of the cell. Proteomics confirms the presence of the protein and provides a direct measure of the quantity present.

Scientists are very interested in proteomics because it describes the working state of an organism more directly than genomics can. First, the level of transcription of a gene gives only a rough estimate of its level of expression into a protein: an mRNA produced in abundance may be degraded rapidly or translated inefficiently, resulting in a small amount of protein. Second, many proteins experience post-translational modifications that profoundly affect their activities; for example, some proteins are not active until they become phosphorylated. Methods such as phosphoproteomics and glycoproteomics are used to study post-translational modifications. Third, many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications. Fourth, many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules. Finally, the rate of protein degradation plays an important role in protein content.
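The first and last points can be made concrete with a deliberately simple model. Assume, purely for illustration, that protein is made at a rate proportional to mRNA abundance and removed at a rate proportional to protein abundance; the steady-state protein level is then mRNA × (translation rate) / (degradation rate). The gene names and rate constants below are hypothetical.

```python
def steady_state_protein(mrna, k_translation, k_degradation):
    """Steady state of dP/dt = k_translation * mrna - k_degradation * P.

    This first-order model is an assumption made for illustration; real
    protein turnover is more complicated.
    """
    return mrna * k_translation / k_degradation


# Hypothetical genes: X has ten times more mRNA than Y, but X is translated
# slowly and its protein is degraded quickly.
gene_x = steady_state_protein(mrna=100.0, k_translation=2.0, k_degradation=4.0)
gene_y = steady_state_protein(mrna=10.0, k_translation=20.0, k_degradation=0.5)

print(gene_x)  # 50.0
print(gene_y)  # 400.0
```

Under these assumed rates the gene with one tenth of the mRNA ends up with eight times more protein, which is why transcript levels alone are a rough guide to protein content.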

Post-translational modifications

Translation from mRNA is not the only source of differences: many proteins are also subjected to a wide variety of chemical modifications after translation. Many of these post-translational modifications are critical to the protein’s function.

Phosphorylation

One such modification is phosphorylation, which happens to many enzymes and structural proteins in the process of cell signaling. The addition of a phosphate to particular amino acids—most commonly serine and threonine mediated by serine/threonine kinases, or more rarely tyrosine mediated by tyrosine kinases—causes a protein to become a target for binding or interacting with a distinct set of other proteins that recognize the phosphorylated domain.

Because protein phosphorylation is one of the most-studied protein modifications, many “proteomic” efforts are geared to determining the set of phosphorylated proteins in a particular cell or tissue type under particular circumstances. This alerts the scientist to the signaling pathways that may be active in that instance.
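In mass spectrometry, each added phosphate group increases a peptide's monoisotopic mass by about 79.966 Da (the mass of HPO3). The following sketch, with hypothetical function names and an invented peptide, lists the residues that could carry a phosphate and infers how many phosphates an observed mass difference corresponds to. It does not say which of the candidate sites is actually modified; that requires fragment-level data.

```python
PHOSPHO_SHIFT = 79.96633  # monoisotopic mass added per phosphate (HPO3), in Da


def candidate_sites(sequence):
    """1-based positions of serine, threonine and tyrosine residues."""
    return [(i + 1, aa) for i, aa in enumerate(sequence) if aa in "STY"]


def phosphate_count(unmodified_mass, observed_mass, tol=0.01):
    """Number of phosphates that explains the mass difference, or None."""
    delta = observed_mass - unmodified_mass
    n = round(delta / PHOSPHO_SHIFT)
    if n >= 0 and abs(delta - n * PHOSPHO_SHIFT) <= tol:
        return n
    return None


# Invented peptide and masses, for illustration only.
print(candidate_sites("GASKTY"))                         # [(3, 'S'), (5, 'T'), (6, 'Y')]
print(phosphate_count(1000.0, 1000.0 + 2 * PHOSPHO_SHIFT))  # 2
print(phosphate_count(1000.0, 1040.0))                   # None
```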

Ubiquitination

Ubiquitin is a small protein that can be affixed to certain protein substrates by enzymes called E3 ubiquitin ligases. Determining which proteins are poly-ubiquitinated can be helpful in understanding how protein pathways are regulated, so this is another legitimate “proteomic” study. Similarly, once it is determined which substrates are ubiquitinated by each ligase, determining the set of ligases expressed in a particular cell type will be helpful.

Additional modifications

Listing all the protein modifications that might be studied in a “Proteomics” project would require a discussion of most of biochemistry; therefore, a short list will serve here to illustrate the complexity of the problem.

In addition to phosphorylation and ubiquitination, proteins can be subjected to (among others) methylation, acetylation, glycosylation, oxidation and nitrosylation. Some proteins undergo all of these modifications, often in time-dependent combinations, aptly illustrating the potential complexity one has to deal with when studying protein structure and function.
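A quick count shows how fast these combinations grow. The sketch below enumerates every combination of modification states for a hypothetical protein with three modifiable sites, assuming (as a simplification) that each site is modified independently of the others. The site names and the modifications allowed at each site are invented.

```python
from itertools import product


def modification_states(sites):
    """All combinations of per-site states; None means the site is unmodified.

    Treating sites as independent is an assumption, so the result is an
    upper bound on the distinct forms of the protein.
    """
    options = [[None] + mods for mods in sites.values()]
    return list(product(*options))


# Hypothetical protein with three modifiable residues.
sites = {
    "S10": ["phospho"],
    "K27": ["acetyl", "methyl", "ubiquitin"],
    "T45": ["phospho", "glyco"],
}

states = modification_states(sites)
print(len(states))  # 24, i.e. 2 * 4 * 3
```

With only three sites there are already 24 possible forms of one protein; the count multiplies with every additional site.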

Distinct proteins are made under distinct settings

Even if one is studying a particular cell type, that cell may make different sets of proteins at different times, or under different conditions. Furthermore, as mentioned, any one protein can undergo a wide range of post-translational modifications.

Therefore a “proteomics” study can become quite complex very quickly, even if the object of the study is very restricted. In more ambitious settings, such as the search for a tumor biomarker, where the proteomics scientist must study serum samples from multiple cancer patients, the complexity that must be dealt with is as great as in any modern biological project.

 

Robotic preparation of MALDI mass spectrometry samples on a sample carrier.

 

 

Complexity of the problem
After genomics, proteomics is considered the next step in the study of biological systems. It is much more complicated than genomics mostly because while an organism’s genome is more or less constant, the proteome differs from cell to cell and from time to time. This is because distinct genes are expressed in distinct cell types. This means that even the basic set of proteins which are produced in a cell needs to be determined.

In the past this was done by mRNA analysis, but mRNA levels were found to correlate poorly with protein content. It is now known that mRNA is not always translated into protein, and the amount of protein produced for a given amount of mRNA depends on the gene it is transcribed from and on the current physiological state of the cell. Proteomics confirms the presence of the protein and provides a direct measure of the quantity present.


 

Methods of studying proteins

 

Determining proteins which are post-translationally modified

One way to study a protein that carries a particular modification is to develop an antibody specific to that modification. For example, phospho-specific antibodies recognize certain proteins only when they are tyrosine-phosphorylated, and there are antibodies specific to other modifications as well. These can be used to determine the set of proteins that have undergone the modification of interest.

For sugar modifications, such as glycosylation of proteins, certain lectins have been discovered which bind sugars. These too can be used.

A more common way to detect a post-translational modification of interest is to subject a complex mixture of proteins to electrophoresis in “two dimensions”, which simply means that the proteins are electrophoresed first in one direction and then in another. This allows small differences in a protein to be visualized by separating a modified protein from its unmodified form. This methodology is known as “two-dimensional gel electrophoresis”.

Recently, another approach called PROTOMAP has been developed. It combines SDS-PAGE with shotgun proteomics to enable detection of changes in gel migration such as those caused by proteolysis or post-translational modification.

Determining the existence of proteins in complex mixtures

Classically, antibodies to particular proteins or to their modified forms have been used in biochemistry and cell biology studies. These are among the most common tools used by practicing biologists today.

For more quantitative determinations of protein amounts, techniques such as ELISAs can be used.

For proteomic study, more recent techniques such as matrix-assisted laser desorption/ionization (MALDI) and, increasingly, electrospray ionization (ESI) have been employed for rapid determination of proteins in particular mixtures.

Computational methods in studying protein biomarkers

Computational predictive models have shown that extensive and diverse feto-maternal protein trafficking occurs during pregnancy and can be readily detected non-invasively in maternal whole blood. This computational approach circumvented a major limitation of fetal proteomic analysis of maternal blood: the abundance of maternal proteins, which interferes with the detection of fetal proteins. Computational models can use fetal gene transcripts previously identified in maternal whole blood to create a comprehensive proteomic network of the term neonate. Such work shows that the fetal proteins detected in a pregnant woman’s blood originate from a diverse group of tissues and organs of the developing fetus. The proteomic networks contain many biomarkers that are proxies for development and illustrate the potential clinical application of this technology as a way to monitor normal and abnormal fetal development.

An information theoretic framework has also been introduced for biomarker discovery, integrating biofluid and tissue information. This new approach takes advantage of functional synergy between certain biofluids and tissues with the potential for clinically significant findings not possible if tissues and biofluids were considered individually. By conceptualizing tissue-biofluid as information channels, significant biofluid proxies can be identified and then used for guided development of clinical diagnostics. Candidate biomarkers are then predicted based on information transfer criteria across the tissue-biofluid channels. Significant biofluid-tissue relationships can be used to prioritize clinical validation of biomarkers.
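The text does not specify the information transfer criteria used, so the following is only a generic illustration of the idea of treating a tissue-biofluid pair as an information channel. It computes the mutual information, in bits, between a marker's discretized level in a tissue and its level in a biofluid across paired samples. The function name, the "hi"/"lo" discretization and the sample data are all assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two paired lists of discrete labels."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Invented paired measurements of one marker in tissue and in two biofluids.
tissue = ["hi", "lo", "hi", "lo"]
fluid_a = ["hi", "lo", "hi", "lo"]  # tracks the tissue level exactly
fluid_b = ["hi", "hi", "lo", "lo"]  # unrelated to the tissue level

print(mutual_information(tissue, fluid_a))  # 1.0
print(mutual_information(tissue, fluid_b))  # 0.0
```

In this toy case the first biofluid would be the better proxy for the tissue, since its marker level carries a full bit of information about the tissue state while the second carries none.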

Establishing protein–protein interactions

Most proteins function in collaboration with other proteins, and one goal of proteomics is to identify which proteins interact. This is especially useful in determining potential partners in cell signaling cascades.

Several methods are available to probe protein–protein interactions. The traditional method is yeast two-hybrid analysis. Newer methods include protein microarrays, immunoaffinity chromatography followed by mass spectrometry, dual polarisation interferometry, microscale thermophoresis, experimental methods such as phage display, and computational methods.
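Whatever the experimental method, the result is usually a list of pairwise interactions, which is naturally handled as an undirected graph. The sketch below builds such a graph from invented interaction pairs and collects every protein reachable from a starting protein, a rough way to propose members of a signaling cascade. The protein names and function names are hypothetical.

```python
from collections import defaultdict, deque


def build_network(pairs):
    """Undirected interaction graph: protein -> set of direct partners."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def connected_group(network, start):
    """All proteins linked to start through any chain of interactions (BFS)."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for partner in network[node]:
            if partner not in seen:
                seen.add(partner)
                queue.append(partner)
    return seen


# Invented pairwise interactions, e.g. as reported by a two-hybrid screen.
pairs = [
    ("receptor", "kinase1"),
    ("kinase1", "kinase2"),
    ("kinase2", "transcription_factor"),
    ("chaperone", "client"),
]

network = build_network(pairs)
print(sorted(network["kinase1"]))                    # ['kinase2', 'receptor']
print(sorted(connected_group(network, "receptor")))
```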

 

Practical applications of proteomics

 

An example of a protein structure determined by the Argonne Midwest Center for Structural Genomics.

 

 

 

One of the most promising developments to come from the study of human genes and proteins has been the identification of potential new drugs for the treatment of disease. This relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs. For example, if a certain protein is implicated in a disease, its 3D structure provides the information to design drugs to interfere with the action of the protein. A molecule that fits the active site of an enzyme, but cannot be released by the enzyme, will inactivate the enzyme. This is the basis of new drug-discovery tools, which aim to find new drugs to inactivate proteins involved in disease. As genetic differences among individuals are found, researchers expect to use these techniques to develop personalized drugs that are more effective for the individual.

Biomarkers

The FDA defines a biomarker as, “A characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention”.

Understanding the proteome, the structure and function of each protein and the complexities of protein–protein interactions will be critical for developing the most effective diagnostic techniques and disease treatments in the future.

An interesting use of proteomics is the diagnosis of disease with specific protein biomarkers. A number of techniques make it possible to test for proteins produced during a particular disease, which helps to diagnose the disease quickly. These techniques include western blot, immunohistochemical staining, enzyme-linked immunosorbent assay (ELISA) and mass spectrometry.
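How well a protein biomarker separates patients from healthy individuals is commonly summarized by its sensitivity and specificity at a chosen cutoff. The sketch below computes both for invented concentration values; the numbers, the cutoff and the assumption that higher levels indicate disease are all hypothetical.

```python
def sensitivity_specificity(diseased, healthy, threshold):
    """Fraction of diseased samples at or above the cutoff (sensitivity)
    and fraction of healthy samples below it (specificity)."""
    true_positives = sum(value >= threshold for value in diseased)
    true_negatives = sum(value < threshold for value in healthy)
    return true_positives / len(diseased), true_negatives / len(healthy)


# Invented biomarker concentrations (arbitrary units), e.g. from an ELISA.
diseased = [8.1, 9.4, 6.2, 7.7, 4.9]
healthy = [3.0, 4.1, 5.5, 2.8, 3.9]

print(sensitivity_specificity(diseased, healthy, threshold=5.0))  # (0.8, 0.8)
```

Moving the cutoff trades one measure against the other, which is why a candidate biomarker has to be validated on many samples before clinical use.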

Proteogenomics

In what is now commonly referred to as proteogenomics, proteomic technologies such as mass spectrometry are used for improving gene annotations. Parallel analysis of the genome and the proteome facilitates discovery of post-translational modifications and proteolytic events, especially when comparing multiple species (comparative proteogenomics).
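One basic proteogenomic step is to check where a peptide observed by mass spectrometry could be encoded in a genome. The sketch below translates a DNA sequence in all six reading frames using the standard genetic code and reports the frames that contain the peptide. The DNA sequence and peptide are invented, and the function names are hypothetical; real pipelines also handle introns, sequence variants and statistical validation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame):
    """Translate from offset frame (0, 1 or 2); '*' marks a stop codon."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def locate_peptide(dna, peptide):
    """(strand, frame, amino-acid index) for each frame containing the peptide."""
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            index = translate(seq, frame).find(peptide)
            if index != -1:
                hits.append((strand, frame, index))
    return hits


# Invented 17-nucleotide sequence and a peptide "observed" by mass spectrometry.
dna = "CCATGGCTAAAGGTTGA"
print(translate(dna, 2))            # MAKG*
print(locate_peptide(dna, "MAKG"))  # [('+', 2, 0)]
```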

Current research methodologies

Fluorescence two-dimensional differential gel electrophoresis (2-D DIGE) can be used to quantify the variation inherent in the 2-D DIGE process itself and so establish statistically valid thresholds for assigning quantitative changes between samples.
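One simple way to set such a threshold, shown here only as an assumed illustration and not as the procedure used in any particular study, is to run the same sample against itself, measure the spread of the resulting log ratios, and treat only changes outside the mean plus or minus two standard deviations as real. All spot names and values below are invented.

```python
import statistics


def change_bounds(null_log_ratios, n_sd=2.0):
    """Lower and upper log2-ratio bounds expected from technical variation alone."""
    mean = statistics.mean(null_log_ratios)
    sd = statistics.stdev(null_log_ratios)
    return mean - n_sd * sd, mean + n_sd * sd


def changed_spots(log_ratios, bounds):
    """Spots whose log2 ratio falls outside the technical-variation bounds."""
    low, high = bounds
    return [spot for spot, ratio in log_ratios.items() if ratio < low or ratio > high]


# Invented log2 ratios from a same-sample versus same-sample comparison.
null_ratios = [0.05, -0.10, 0.08, -0.02, 0.00, -0.06, 0.11, -0.04]

# Invented log2 ratios (treated versus control) for four gel spots.
experiment = {"spot_1": 0.03, "spot_2": 1.20, "spot_3": -0.90, "spot_4": 0.10}

bounds = change_bounds(null_ratios)
print(changed_spots(experiment, bounds))  # ['spot_2', 'spot_3']
```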

Comparative proteomic analysis can reveal the role of proteins in complex biological systems, including reproduction. For example, treatment with the insecticide triazophos causes an increase in the content of brown planthopper (Nilaparvata lugens (Stål)) male accessory gland proteins (Acps) that can be transferred to females via mating, causing an increase in the fecundity (i.e. birth rate) of females. To identify changes in the types of Acps and reproductive proteins that mated female planthoppers received from males, researchers conducted a comparative proteomic analysis of mated N. lugens females. The results indicated that these proteins participate in the reproductive process of N. lugens adult females and males.

Proteome analysis of Arabidopsis peroxisomes has been established as the major unbiased approach for identifying new peroxisomal proteins on a large scale.

There are many approaches to characterizing the human proteome, which is estimated to contain between 20,000 and 25,000 non-redundant proteins. The number of unique protein species likely increases by between 50,000 and 500,000 due to RNA splicing and proteolysis events, and when post-translational modifications are also considered, the total number of unique human proteins is estimated to be in the low millions.

In addition, first promising attempts to decipher the proteome of animal tumors have recently been reported.
