
# Accepted Point Mutations (PAM)

The PAM model was developed by Margaret Dayhoff and colleagues in work published in 1966 and 1978. It describes the rules by which evolutionary change occurs in proteins. Their approach was to catalog hundreds of proteins and compare the sequences of closely related proteins in many families. The model records the specific amino acid substitutions observed when two homologous sequences are aligned.

The analysis is phylogenetic: rather than comparing two observed sequences directly, Dayhoff et al. compared each sequence with the inferred common ancestor of the sequences. They defined an accepted point mutation as the replacement of one amino acid in a protein by another residue that has been accepted by natural selection.

## Assumptions of Accepted Point Mutation

An amino acid change is accepted by natural selection when two conditions hold. First, a gene undergoes a DNA mutation such that it encodes a different amino acid. Such a change is usually a conservative replacement: it does not substantially alter the function of the protein. Second, the entire species adopts that change as the predominant form of the protein.

## Steps in the development of Accepted Point Mutation matrices

### Step one: selection of data set

Model development involved examining closely related proteins from 34 superfamilies, grouped into 71 phylogenetic trees that contain 1572 accepted changes. The proteins came from different families, for example beta globin, alpha globin, and myoglobin. For each family, the rate of accepted point mutations was estimated. Some families evolve faster than others: histones accept fewer than one mutation per 100 residues in 100 million years, whereas immunoglobulins accept up to 37. Some protein families accept fewer mutations because changing an amino acid alters the structure of the protein and can interfere with its function, which may cause abnormality or harm to the organism.

### Step two: Determining the frequency of amino acid occurrence

Normalized frequencies of the amino acids are obtained in this step. The values are tabulated and sum to 1. If the twenty amino acids were equally represented in proteins, these values would all be 0.05. Instead, they vary: for example, the frequency of glycine is 0.089, alanine 0.087, and arginine 0.041.
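The normalization in this step can be sketched in a few lines of Python. This is only an illustration: the function name `normalized_frequencies` and the toy sequences are assumptions made for the example, not Dayhoff's data or code.

```python
from collections import Counter

def normalized_frequencies(sequences):
    """Return {amino_acid: fraction of all residues} for the given sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)          # count every residue letter
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Toy sequences, invented for illustration (not real protein data).
toy_sequences = ["GAVLG", "GARAA"]
freqs = normalized_frequencies(toy_sequences)

# The fractions always sum to 1, as the text requires.
total_frequency = sum(freqs.values())
```

With real data the input would be the full set of protein sequences, and the result would be the table of twenty values described above.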

### Step three: Calculation of relative mutability of the amino acid

Relative mutability refers to how often each amino acid is likely to change over a short evolutionary period. It is calculated from how often each amino acid is observed to mutate, relative to how often it occurs. Why are some amino acids more mutable than others? The less mutable residues probably have important structural or functional roles in proteins, such that the consequence of replacing them could be harmful to the organism.
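As a rough sketch of the idea, mutability can be treated as the number of observed changes divided by the number of occurrences. This is a simplification of the published procedure, and the function name and all counts below are invented for illustration.

```python
def relative_mutability(changes, occurrences):
    """Simplified mutability: observed changes per occurrence of each residue."""
    return {aa: changes[aa] / occurrences[aa] for aa in occurrences}

# Invented toy counts (hypothetical, not Dayhoff's data).
toy_changes = {"N": 40, "A": 30, "W": 2}
toy_occurrences = {"N": 100, "A": 150, "W": 20}

mutability = relative_mutability(toy_changes, toy_occurrences)

# Rank residues from most to least mutable.
ranked = sorted(mutability, key=mutability.get, reverse=True)
```

In this toy data set the residue with the lowest ratio is the one a model would treat as most resistant to replacement.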

### Step four: generate a mutation probability matrix M

The matrix is obtained from the accepted mutation data and the probabilities of occurrence of each amino acid. Each element M_{ij} gives the probability that the original amino acid j will be replaced by another amino acid i over a defined time interval.

The probability matrix M is generated for PAM1. PAM1 refers to an evolutionary divergence in which one percent of the amino acids have changed between two protein sequences, so the sequences remain 99 percent identical.

Non-diagonal elements are obtained by multiplying the proportionality constant (k), the mutability of the j^{th} amino acid (m_{j}), and the accepted point mutation count for the pair (A_{ij}), then dividing by the sum of A_{ij} over all i, that is, the total number of accepted mutations involving amino acid j.

M_{ij} = k m_{j} A_{ij} / \sum_{i} A_{ij}

Diagonal elements are obtained by subtracting the product of the proportionality constant (k) and the mutability of the j^{th} amino acid (m_{j}) from 1.

M_{jj} = 1 - k m_{j}

M_{jj} is the probability that the original amino acid j will remain j without undergoing a substitution to another amino acid.
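The construction of the matrix can be sketched as follows. This is a minimal illustration on a three-residue toy alphabet with invented numbers. It assumes the off-diagonal entries are normalized by the column total of accepted mutations, and that k is chosen so that the expected fraction of changed residues is one percent, matching the definition of PAM1. The function name is hypothetical.

```python
def mutation_probability_matrix(A, m, f, target_change=0.01):
    """Build M, where M[i][j] is the probability that original j becomes i.

    A[i][j]: accepted point mutation counts (zero on the diagonal)
    m[j]:    relative mutability of residue j
    f[j]:    normalized frequency of residue j
    """
    residues = list(f)
    # Assumption: k makes the expected fraction of changed residues
    # equal to target_change (1% for PAM1).
    k = target_change / sum(f[j] * m[j] for j in residues)
    M = {i: {} for i in residues}
    for j in residues:
        column_total = sum(A[i][j] for i in residues if i != j)
        for i in residues:
            if i == j:
                M[i][j] = 1 - k * m[j]                      # j stays j
            else:
                M[i][j] = k * m[j] * A[i][j] / column_total  # j becomes i
    return M

# Toy data, invented for illustration only.
A = {
    "A": {"A": 0, "G": 50, "S": 30},
    "G": {"A": 50, "G": 0, "S": 20},
    "S": {"A": 30, "G": 20, "S": 0},
}
m = {"A": 1.0, "G": 0.5, "S": 1.2}
f = {"A": 0.40, "G": 0.35, "S": 0.25}

M = mutation_probability_matrix(A, m, f)
column_sums = {j: sum(M[i][j] for i in M) for j in f}  # each should be 1
conserved = sum(f[j] * M[j][j] for j in f)             # should be 0.99
```

Each column sums to 1 because every original residue either stays the same or becomes one of the others, and the frequency-weighted diagonal gives the 99 percent identity expected of PAM1.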

#### Mutation probabilities

For each original amino acid, it is easy to see which amino acids are most likely to replace it if a change occurs. These data are highly relevant to pairwise sequence alignment because they form the basis of the scoring system in the Dayhoff model: reasonable amino acid substitutions are rewarded, while unlikely substitutions are heavily penalized.

### Step five: Generation of other Accepted point mutation matrices

PAM1 is based on alignments of closely related sequences with 99 percent identity, that is, one percent change. PAM matrices for fast-diverging or distantly related families can be calculated using PAM1 as the basis. In the Dayhoff model, these matrices were calculated by multiplying the PAM1 matrix by itself any number of times. PAM250 is one example.

#### PAM 250

PAM250 is produced by multiplying the PAM1 matrix by itself 250 times. It is one of the matrices commonly used for BLAST searches of databases. This matrix applies to an evolutionary distance at which the proteins share about twenty percent identity.
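The repeated multiplication can be illustrated with a small pure-Python sketch. The 3x3 matrix `toy_pam1` is invented for the example (its columns sum to 1, like a mutation probability matrix) and is not the real 20x20 PAM1. The helper names are assumptions.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(M, n):
    """Return M multiplied by itself until it is raised to the n-th power."""
    size = len(M)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, M)
    return result

# Toy column-stochastic matrix: entry [i][j] is P(original j becomes i).
toy_pam1 = [
    [0.99, 0.02, 0.00],
    [0.01, 0.97, 0.01],
    [0.00, 0.01, 0.99],
]

toy_pam250 = mat_power(toy_pam1, 250)
column_sums = [sum(toy_pam250[i][j] for i in range(3)) for j in range(3)]
```

After 250 steps the diagonal entries are much smaller than in the starting matrix, which mirrors the low sequence identity expected at that distance, while each column still sums to 1.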

### Step six: Determination of relatedness odds matrix from a mutation probability matrix

Dayhoff et al. defined the relatedness odds matrix from two quantities. M_{ij}, an element of a given mutation probability matrix, is the probability that amino acid j will change to i in a homologous sequence. f_{i} is the independent, background probability of amino acid i occurring at that position by chance. The relatedness odds R_{ij} is their ratio:

R_{ij} = M_{ij} / f_{i}

A value of R_{ij} equal to 1 means that the substitution occurs as often as expected by chance. A value of R_{ij} less than 1 means that the substitution occurs less often than expected by chance, so the substitution is unfavored. A value of R_{ij} greater than 1 means that the substitution occurs more often than expected by chance.

When comparing two proteins, it is necessary to determine the value of R_{ij} at each aligned position and then multiply these values to obtain the overall score for the alignment.
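The ratio can be sketched in Python on a two-residue toy alphabet. All numbers here are invented for illustration, and the function name `relatedness_odds` is an assumption.

```python
def relatedness_odds(M, f):
    """Return R, where R[i][j] = M[i][j] / f[i]."""
    return {i: {j: M[i][j] / f[i] for j in M[i]} for i in M}

# Toy mutation probabilities: M[i][j] is P(original j becomes i).
# Each column (fixed j) sums to 1. Values are hypothetical.
M = {
    "L": {"L": 0.90, "C": 0.02},
    "C": {"L": 0.10, "C": 0.98},
}
# Toy background frequencies (hypothetical).
f = {"L": 0.7, "C": 0.3}

R = relatedness_odds(M, f)

# Classify each substitution relative to chance.
favored = {(i, j): R[i][j] > 1 for i in R for j in R[i]}
```

Here the replacement of C by L has odds below 1 (unfavored), while C remaining C has odds above 1 (favored), matching the interpretation given in the text.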

### Step seven: Log Odds Scoring matrix

The logarithmic form of the relatedness odds matrix is known as the log-odds scoring matrix. The cells of the log-odds matrix contain scores for aligning any two residues (including an amino acid with itself) along the length of a pairwise alignment.

Taking logarithms is more convenient because it allows the scores of the aligned residues to be summed when scoring the overall alignment of two sequences. Without logarithms, the ratios at the aligned positions would have to be multiplied, which is computationally cumbersome.
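A short sketch shows why the two approaches agree: summing the log-odds scores gives the same result as taking the logarithm of the product of the odds. The odds values below are invented for illustration.

```python
import math

# Hypothetical relatedness odds at four aligned positions.
odds = [2.5, 0.4, 1.0, 3.0]

# Without logarithms: multiply the ratios.
product = math.prod(odds)

# With logarithms: convert each ratio to a score, then add.
log_scores = [10 * math.log10(r) for r in odds]
total_score = sum(log_scores)

# Both routes describe the same alignment.
score_from_product = 10 * math.log10(product)
```

Addition also avoids the very small or very large intermediate values that a long chain of multiplications can produce.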

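#### Calculating a substitution score: cysteine (C) to leucine (L)

The log-odds formula is used. In the PAM250 mutation probability matrix, the value for cysteine being replaced by leucine is 0.02, and the normalized frequency of leucine is 0.085.

S_{ij} = 10 * log_{10}(M_{ij} / f_{i})

S(cysteine, leucine) = 10 * log_{10}(0.02 / 0.085) = -6.3

The score is negative because the ratio is below 1: this substitution occurs less often than expected by chance.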

High scores in the PAM250 log-odds matrix indicate favored substitutions, while low scores indicate unfavored substitutions, which are penalized heavily.
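The cysteine-to-leucine calculation can be checked with a few lines of Python. The function name `log_odds_score` is an assumption for the example; the inputs 0.02 and 0.085 are the values quoted in the text.

```python
import math

def log_odds_score(m_ij, f_i):
    """Score = 10 * log10(mutation probability / background frequency)."""
    return 10 * math.log10(m_ij / f_i)

# Values from the text: PAM250 probability 0.02, leucine frequency 0.085.
score = log_odds_score(0.02, 0.085)   # about -6.3
rounded = round(score)                # scoring matrices store integers
```

The ratio 0.02 / 0.085 is below 1, so the logarithm is negative, and the rounded value is the kind of integer entry found in a published scoring matrix.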

## Differences between Accepted Point Mutation (PAM) and BLOSUM

PAM is derived from closely related protein sequences; PAM1, for example, corresponds to 99 percent identity between the sequences. BLOSUM focuses on more distantly related proteins.

PAM assumes that substitution probabilities from closely related sequences, such as PAM1, can be extrapolated to distantly related sequences, as in PAM250. In BLOSUM, the values for distantly related sequences are obtained directly from blocks, or alignments, of protein families, regardless of the evolutionary distance.

PAM is mostly based on global alignment, in which whole sequences are compared with each other rather than only portions of them. BLOSUM is mainly based on local alignments, in which segments of sequences are compared with blocks of other sequences.

PAM is calculated from a comparison of sequences with no more than one percent divergence, corresponding to 99% identity. BLOSUM62 is calculated from sequences with 62% identity.

For distantly related sequences, PAM matrices with high numbers are used to reflect a high level of divergence. BLOSUM uses low numbers for distantly related sequences. For example, BLOSUM10 would indicate only 10 percent identity and 90 percent divergence.

For closely related sequences, PAM matrices with low numbers are used to reflect a high level of identity. For example, PAM1 indicates 1 percent divergence and 99 percent identity. BLOSUM uses high numbers for closely related sequences. For example, BLOSUM90 indicates only 10 percent divergence and 90 percent identity.

PAM uses a log-odds ratio to base 10, unlike scoring matrices such as BLOSUM, which use a log-odds ratio to base 2.
