Amines, Amino Acids, and AI

Amines

Ammonia, as I mentioned in a blogpost a year ago, has the chemical formula NH3; remove a hydrogen, and you form an amino group, formula NH2. This can go on to bond with an alkyl group - the resultant product is an amine. There are three main categories of amine: primary amines have one alkyl group bonded to the nitrogen atom, secondary amines have two, and tertiary amines have three.

Amines and ammonia are often similar in properties. Ammonia has one lone electron pair, and the same applies to amines. As a result, both can form hydrogen bonds with water, and are also electron dense (amines slightly more than ammonia since the alkyl group donates electrons to the electronegative nitrogen). Both can act as bases too, with amines reacting with acids to form salts in neutralisation reactions.

Amines are present in various chemicals, including serotonin, a neurotransmitter that regulates emotions and sleep, amongst other functions. They are used in the production of epoxy resins and surfactants, and are prevalent in drug manufacturing, both medicinal - amines are the second most common functional group in the WHO's Essential Medicine List - and recreational (heroin for example).

Primary amine

Biologically, amines play a crucial role too. Five amines act as nitrogenous bases in nucleic acids: adenine, which pairs with thymine (in DNA) or uracil (in RNA), and cytosine, which always pairs with guanine. DNA, found in the cell's nucleus, is the more famous nucleic acid, and contains the genetic information needed for biological processes. RNA carries copies of that information and is also part of the ribosomes themselves, so it's central to protein synthesis, though it can take on other roles too.
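To make the pairing rules concrete, here's a minimal Python sketch that builds the complementary strand one base at a time. The names DNA_PAIRS, RNA_PAIRS and complement are my own, purely for illustration, and the sketch ignores strand direction entirely.

```python
# Base pairs as described above: A with T (DNA) or U (RNA), and C with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def complement(strand: str, rna: bool = False) -> str:
    """Return the base-by-base complement of a strand (not reversed)."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    # A KeyError here means the strand contains a letter that isn't a valid base.
    return "".join(pairs[base] for base in strand.upper())


print(complement("ATGC"))            # TACG
print(complement("AUGC", rna=True))  # UACG
```

Applying complement twice gets you back to the strand you started with, which is a handy sanity check on the pairing tables.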

Simply put, genetic information in DNA is transferred into mRNA, a type of RNA used to carry information from the nucleus to the ribosome in a cell. From here, the ribosome will read this information and synthesise a protein from given instructions. It's more complex than that, unsurprisingly, and I haven't mentioned the other forms of RNA yet. 
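As a rough illustration of that DNA to mRNA to protein flow, here's a short Python sketch. It's heavily simplified: the function names are hypothetical, the codon table is a deliberately tiny slice of the real one, and I'm assuming the template strand is written so that the resulting mRNA reads in the right direction.

```python
# Template-strand DNA base -> the mRNA base built opposite it.
TRANSCRIBE = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A small, incomplete slice of the standard codon table (illustration only).
CODONS = {
    "AUG": "Met", "CCA": "Pro", "GGU": "Gly", "UUU": "Phe", "UGG": "Trp",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(template_dna: str) -> str:
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(TRANSCRIBE[base] for base in template_dna.upper())


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon appears."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]  # KeyError if the codon isn't in the slice
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("TACGGTACT")
print(mrna)             # AUGCCAUGA
print(translate(mrna))  # ['Met', 'Pro']
```

A real ribosome also has to find the start codon and relies on tRNA to bring in each amino acid, neither of which this sketch attempts.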

Amides and polymerisation

Should an amine group replace the OH group in a carboxyl group, you get an amide. Amides are derivatives of carboxylic acids, so you might expect it to be easy to go from one to the other. Whilst it is possible, the process is more straightforward if you go via an acyl chloride (functional group COCl).

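Amide

Like amines, amides come in primary, secondary and tertiary variations. Unlike amines, though, they are only very weakly basic, since the nitrogen's lone pair is delocalised into the neighbouring C=O group. Their most peculiar property, however, is the ability to form polyamides. These are polymers containing an amide link (formula CONH).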

One polymer which contains an amide link is nylon, a tough, elastic fibre often used in fabrics. It's synthesised from a dicarboxylic acid and a diamine; a hydroxyl group from the carboxylic acid will react with a hydrogen from the amine and produce water, and in its place you get an endlessly repeating polymer. Another example is Kevlar, a material used in bulletproof vests, and its monomers - again a dicarboxylic acid and a diamine - contain benzene rings as well. 

Chemguide have more on amides here.

Amino acids

If a compound contains both an amine group and a carboxyl group, it's classified as an amino acid; in the ones that build proteins, both groups are bonded to the same carbon atom.

Amino acid

To say they're important would be an understatement - they're the repeating units of proteins, lengthy polymers which help sustain life on Earth. The curious thing is that there are only 20 unique amino acids which are coded genetically in humans, yet they contribute to various vital life processes, like protein synthesis and metabolic reactions. 

Out of these twenty, nine are essential for humans, meaning they can only be obtained through one's diet. In addition to these twenty, there are two other amino acids - selenocysteine, which is needed for the proper functioning of the human body and was the first addition to the standard twenty; and pyrrolysine, which isn't coded in humans because it's only found in certain archaea and bacteria. Some of the amino acids are aromatic (like tryptophan or phenylalanine), others contain sulphur (like cysteine) - there is often variation in the molecular structures of the amino acids, as can be seen in this chart.

Notice how an amino acid contains an amine group, which is basic, yet the carboxyl group is acidic. As such, amino acids are amphiprotic, and can act as either an acid or a base, depending on the pH of their surroundings. This means that the carboxyl group can donate a hydrogen ion - a proton - to the amine group to form a zwitterion, an ion which carries both a positive and a negative charge - overall, these cancel out, so a zwitterion has no net charge. Amino acids are most commonly found as zwitterions in aqueous solution.

Chirality

A carbon atom bonded to four different atoms or groups is a chiral centre (designated with an *), and these are very common across organic chemistry. Aside from glycine, whose central carbon is bonded to two hydrogen atoms because it has no alkyl side group, all coded amino acids have a chiral centre, which means that, depending on the arrangement of the groups around that carbon, the amino acid can have different properties. In other words, these amino acids exhibit optical isomerism - molecules with the same structural formula which are mirror images of each other, but cannot be superimposed onto each other. I can't wait to lose marks on the A-Level exam by drawing the wrong isomer.

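This difference in properties can prove harmful. Though it's not an amino acid, thalidomide is a good example of why one should be careful when handling chiral compounds. One optical isomer acts as a sedative, while the other is blamed for the birth defects that notoriously followed when the drug was prescribed in the late 1950s and early 1960s to alleviate morning sickness. The two isomers can interconvert in the body, so supplying just one of them wouldn't have solved the problem; today the drug is used, under strict controls, to treat multiple myeloma.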

Peptides and proteins

Amino acids can join together through peptide bonds (amide links) to form peptides. Longer chains are called polypeptides, and one or more polypeptides folded into shape make up a protein. These chemicals can play vital roles in bodily functions and the endocrine system. Oxytocin, for instance, is a hormone which facilitates childbirth and is commonly considered the "love hormone" - it's a peptide of just nine amino acids, and was in fact the first polypeptide hormone to be synthesised.

The possibilities are practically endless - the twenty standard amino acids (excluding selenocysteine and pyrrolysine) can be arranged in different ways to form a polypeptide chain, giving 20^n possible sequences for a chain n amino acids long. As a result, you can end up with lengthy strings of amino acids coalescing into a protein - titin, which helps to provide elasticity to muscle, has around 34,000 of them. As such, these proteins can be lengthy in name too - titin's full chemical name has nearly 190,000 letters, simply due to all the various amino acid radicals which appear time and time again.
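To get a feel for how quickly that number grows, here's a quick Python sketch; the function names are my own. It counts digits with logarithms rather than printing the full number, since the titin-sized figure is far too long to write out.

```python
import math

AMINO_ACID_COUNT = 20  # the twenty standard amino acids


def sequence_count(n: int) -> int:
    """Number of distinct chains that are n amino acids long."""
    return AMINO_ACID_COUNT ** n


def digit_count(n: int) -> int:
    """How many decimal digits 20^n has, worked out from log10."""
    return math.floor(n * math.log10(AMINO_ACID_COUNT)) + 1


print(sequence_count(2))   # 400
print(sequence_count(9))   # 512000000000, for a chain the length of oxytocin
print(digit_count(34000))  # digits in the count for a titin-sized chain
```

Even a chain of nine amino acids has over half a trillion possible sequences, and for a chain of 34,000 the count itself runs to tens of thousands of digits.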

Titin

A detour into AI

Since there are that many ways to form a polypeptide, and hence a protein, it can be hard to predict exactly how a protein will fold and thus how it will interact. Luckily, AI has managed to get quite good at predicting the structure a protein will adopt from its amino acid sequence.

In 2020, Google DeepMind - an AI research company "solving some of the hardest scientific and engineering challenges of our time" - was able to predict the structures of proteins from these sequences in mere minutes through AlphaFold. In 2022, their database expanded to contain over 200,000,000 different protein structures, spanning animals, plants, bacteria, and fungi. This Nature article from 2021 highlights just how big a role AI is playing in determining these structures, as well as in potentially explaining data generated from X-ray crystallography. 

This year, however, published papers discussed how AlphaFold3 - the far more accurate third generation of the AI - was able to model how proteins fold and function, as well as how they interact with other molecules like ligands. All these breakthroughs culminated in two DeepMind researchers sharing the Nobel Prize in Chemistry this year.

The best part, however, is that the code is now open source and you can play around with their database using this link. This protein is found in E. coli, a bacterium that can cause food poisoning, and you can see that the model is quite confident in its modelling.
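That confidence is reported as a per-residue score called pLDDT, which runs from 0 to 100 and is shown on the database pages as four colour-coded bands. Here's a small Python sketch of how a score maps onto those bands; the function name is my own, the example scores are made up, and how I've treated scores sitting exactly on a boundary is an assumption.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0 to 100) onto the four confidence bands."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT scores run from 0 to 100")
    if score > 90:
        return "very high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very low"


# Hypothetical scores for four residues, not taken from a real prediction.
for score in (96.2, 81.0, 63.5, 32.1):
    print(score, plddt_band(score))
```

A structure that is mostly in the top band is one where the model expects its prediction to be close to the real thing; low-scoring stretches often correspond to flexible or disordered regions.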

DeepMind say that this research has enabled a greater understanding of antibiotic resistance, work on treatments for Parkinson's disease and malaria, and the creation of enzymes that can combat plastic pollution. On the surface, the possibilities truly seem endless, and perhaps this is merely the beginning of AI dominating scientific research - either way, it's hard not to be excited.

Conclusion

It's undeniable that amino acids are essential to life, and it's their seemingly simple structure that makes the resultant peptides and proteins arguably more astonishing. Ultimately, we're only able to function because of how certain molecules fold, which is strange out of context.

This blogpost was definitely one of the more peculiar ones to research, mainly due to how almost every scientific institution loves locking their research papers behind paywalls - I am eternally grateful to LibreTexts. There's also the added fact that AI plays a crucial role in this story, and whilst I've expressed my doubts about AI in the past, I think it's a great benefit to have technology used in this way, not least as the pros outweigh the cons in this case. 

This is also by far my longest blogpost to date, and it also happens to almost entirely cover a subchapter of my A Level course - so maybe it's a revision post done early.
