HLA Nomenclature

A new HLA nomenclature was introduced in April 2010, replacing a system that had been in use since the 1990s. The main driver for the change was that the old system could no longer accommodate the growing number of HLA alleles being described. There are currently over 5,700 alleles described across all the classical and non-classical HLA loci.


The old system assigned significance to pairs of digits in the allele name. For example, in the allele HLA-A*02010102L, the designation ‘HLA’ identifies the allele as an HLA allele. The dash (-) separates the HLA designation from the gene, in this case the ‘A’ gene. The ‘*’ is a separator. Within the allele name itself, the first two digits (02 in 02010102L) represent the allele group and, in most instances, were synonymous with the serological type (A2 in this case). The third and fourth digits (01) identified the specific allele. All alleles whose names differed in these first four positions (0201) must code for proteins with different sequences. Alleles whose names differed only in the fifth and sixth positions (01) differ by silent (synonymous) mutations within the coding sequence, so they code for the same protein. Sequences that differed only by mutations in the introns or in the 5’ and 3’ untranslated regions flanking the exons were distinguished by different digits in the seventh and eighth positions (02). In addition, a number of suffixes were used to identify sequences that were null, i.e. not expressed (N), those that had low expression (L), those that were secreted (S), those found only in the cytoplasm (C), those with questionable expression (Q) and those with aberrant expression (A).
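The digit-pair scheme described above can be illustrated with a short Python sketch. This is only an illustration, not an official parser: the function name parse_old_name, the dictionary keys and the regular expression are all assumptions made for this example, and the pattern assumes names of four, six or eight digits with an optional single-letter suffix.

```python
import re

# Expression suffixes as listed in the text above.
SUFFIX_MEANING = {
    "N": "null (not expressed)",
    "L": "low expression",
    "S": "secreted",
    "C": "cytoplasm only",
    "Q": "questionable expression",
    "A": "aberrant expression",
}

# Hypothetical pattern: "HLA-", gene, "*", 2 to 4 digit pairs, optional suffix.
OLD_NAME = re.compile(
    r"^HLA-(?P<gene>[A-Za-z0-9]+)\*"
    r"(?P<digits>(?:\d{2}){2,4})"
    r"(?P<suffix>[NLSCQA])?$"
)


def parse_old_name(name):
    """Split an old-style (pre-2010) allele name into its digit pairs."""
    match = OLD_NAME.match(name)
    if match is None:
        raise ValueError(f"not an old-style HLA allele name: {name!r}")
    digits = match.group("digits")
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    pairs += [None] * (4 - len(pairs))  # pad names shorter than 8 digits
    return {
        "gene": match.group("gene"),
        "allele_group": pairs[0],        # digits 1-2
        "specific_allele": pairs[1],     # digits 3-4
        "synonymous_variant": pairs[2],  # digits 5-6
        "noncoding_variant": pairs[3],   # digits 7-8
        "expression": SUFFIX_MEANING.get(match.group("suffix")),
    }
```

For example, parse_old_name("HLA-A*02010102L") reports gene A, allele group 02, specific allele 01, synonymous variant 01, non-coding variant 02 and low expression.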


A key limitation of the old system was that each pair of digits could take only 99 values (01 to 99), so no more than 99 alleles could be distinguished at any one pair of positions. The HLA-A*02 and B*15 allele groups were the first to run into this problem when more than 99 alleles were detected for each. At that time, the WHO Nomenclature Committee for Factors of the HLA System decided to adopt the rollover sequences A*92 and B*95 for A*02 and B*15 respectively. Once A*0299 had been assigned, the next A*02 allele described was named A*9201. Similarly, once B*1599 had been assigned, the next B*15 allele described was named B*9501. More recently, however, a number of other allele groups began to approach 99 alleles, including A*03, B*40, B*44 and DRB1*11. Adopting rollover sequences for all of these was impractical. A rollover system of sorts had already been adopted for HLA-DPB1: once HLA-DPB1*9901 had been assigned, the next HLA-DPB1 allele was named ‘within the existing sequences’ as HLA-DPB1*0102.


In 2010, a new nomenclature system was adopted. This introduced colons ‘:’ as separators between pairs of digits. HLA-A*02010102L therefore became HLA-A*02:01:01:02L. The groups of digits separated by colons are known as Fields. The first and second digits of the old nomenclature form the 1st Field of the new nomenclature. The third and fourth digits of the old nomenclature form the 2nd Field of the new nomenclature, and so on for the 3rd and 4th Fields. To help reduce confusion in adopting the new nomenclature, the leading ‘0’ in alleles 1-9 of each allele group was kept.
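The basic conversion described above, inserting a colon between each pair of digits, might be sketched in Python as follows. The function name add_field_separators and the regular expression are assumptions made for this illustration. The sketch covers only the straightforward case: it does not handle names that were remapped rather than simply reformatted, such as the rollover groups, HLA-DPB1 or HLA-Cw names.

```python
import re

# Hypothetical pattern for an old-style name: prefix, 2 to 4 digit pairs,
# and an optional expression suffix.
OLD_NAME = re.compile(
    r"^(?P<prefix>HLA-[A-Za-z0-9]+\*)"
    r"(?P<digits>(?:\d{2}){2,4})"
    r"(?P<suffix>[NLSCQA]?)$"
)


def add_field_separators(old_name):
    """Insert ':' between digit pairs, keeping leading zeros and suffix."""
    match = OLD_NAME.match(old_name)
    if match is None:
        raise ValueError(f"not an old-style HLA allele name: {old_name!r}")
    digits = match.group("digits")
    fields = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return match.group("prefix") + ":".join(fields) + match.group("suffix")
```

Called on the example from the text, add_field_separators("HLA-A*02010102L") returns "HLA-A*02:01:01:02L".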


The introduction of the colons means that each Field is no longer restricted to two digits (99 values) but can grow as needed. Once HLA-A*03:99 had been assigned, the next A*03 allele could be named HLA-A*03:100.


With the introduction of colons, and therefore the removal of the artificial two-digit restriction, there is no further need for rollover sequences. HLA-A*92 and B*95 were renamed A*02 and B*15 respectively and their associated alleles remapped. A*9201 became A*02:101, A*9202 became A*02:102, and so on. HLA-B*9501 became B*15:101, HLA-B*9502 became B*15:102, and so on. To make this remapping easier, HLA-A*02:100 and B*15:100 were not used. However, other allele groups that exceed 99 alleles will use allele 100. HLA-DPB1 alleles were also remapped: HLA-DPB1*0102 became HLA-DPB1*100:01.
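The remapping of the A*92 and B*95 rollover names follows a simple pattern in the examples above: the rollover group is replaced by the original group, and 100 is added to the allele number. A minimal Python sketch of that pattern is given below. The function name remap_rollover and the lookup table are assumptions made for this illustration, and the sketch is limited to four-digit names because those are the only cases the text gives; the HLA-DPB1 remapping is not covered.

```python
import re

# Rollover group -> original group, as described in the text.
ROLLOVER_GROUPS = {("A", "92"): "02", ("B", "95"): "15"}

# Hypothetical pattern: optional "HLA-", gene A or B, "*", four digits,
# optional expression suffix.
ROLLOVER_NAME = re.compile(
    r"^(?P<prefix>HLA-)?(?P<gene>[AB])\*"
    r"(?P<group>\d{2})(?P<allele>\d{2})"
    r"(?P<suffix>[NLSCQA]?)$"
)


def remap_rollover(old_name):
    """Map a four-digit A*92xx or B*95xx name to its colon-delimited name."""
    match = ROLLOVER_NAME.match(old_name)
    if match is None:
        raise ValueError(f"unrecognised name: {old_name!r}")
    key = (match.group("gene"), match.group("group"))
    if key not in ROLLOVER_GROUPS:
        raise ValueError(f"not a rollover allele group: {old_name!r}")
    # A*9201 -> A*02:101, so the 2nd Field is 100 plus the old allele number.
    second_field = 100 + int(match.group("allele"))
    prefix = match.group("prefix") or ""
    return (f"{prefix}{match.group('gene')}*{ROLLOVER_GROUPS[key]}"
            f":{second_field}{match.group('suffix')}")
```

With this sketch, remap_rollover("A*9201") gives "A*02:101" and remap_rollover("HLA-B*9502") gives "HLA-B*15:102", matching the examples in the text.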


A number of other changes were made to the nomenclature. The ‘w’ was dropped from HLA-Cw allele names but not from Cw antigen names, so HLA-Cw*0102 became HLA-C*01:02. The ‘w’ was kept in antigen names to avoid confusion with complement factors as well as with KIR ligand groups. For ambiguous allele strings, the codes ‘P’ and ‘G’ were introduced. A group of alleles that share the same nucleotide sequence across exons 2 and 3 for HLA class I, or exon 2 for HLA class II, is named after the first three Fields of the lowest-numbered allele in the group, with the code ‘G’ as a suffix, e.g. HLA-A*02:01:01G. A group of alleles that share the same protein sequence in the peptide-binding domains (encoded by those same exons), irrespective of nucleotide sequence differences, is named after the first two Fields of the lowest-numbered allele in the group, with the code ‘P’ as a suffix, e.g. HLA-A*02:01P, which includes HLA-A*02:01:01 and HLA-A*02:01:02.
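To show how the colon-delimited names can be read programmatically, here is a minimal Python sketch that splits a new-style name into its gene, Fields and trailing code. The function name parse_new_name, the dictionary keys and the regular expression are assumptions made for this illustration; the sketch treats ‘G’ and ‘P’ as group codes and the other suffix letters as expression suffixes, and it does not check that a name actually exists.

```python
import re

EXPRESSION_SUFFIXES = set("NLSCQA")  # expression suffixes from the old system
GROUP_CODES = set("GP")              # ambiguity group codes

# Hypothetical pattern: 2 to 4 colon-separated Fields of two or more digits,
# followed by at most one suffix letter.
NEW_NAME = re.compile(
    r"^HLA-(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d{2,}(?::\d{2,}){1,3})"
    r"(?P<suffix>[NLSCQAGP]?)$"
)


def parse_new_name(name):
    """Split a colon-delimited allele or group name into its parts."""
    match = NEW_NAME.match(name)
    if match is None:
        raise ValueError(f"not a colon-delimited HLA name: {name!r}")
    suffix = match.group("suffix")
    return {
        "gene": match.group("gene"),
        "fields": match.group("fields").split(":"),
        "expression": suffix if suffix in EXPRESSION_SUFFIXES else None,
        "group_code": suffix if suffix in GROUP_CODES else None,
    }
```

For instance, parse_new_name("HLA-DPB1*100:01") yields the Fields "100" and "01" with no suffix, while an old-style name such as "HLA-A*0201" is rejected because it contains no colons.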


