The BpForms grammar extends the IUPAC/IUBMB notation commonly used to represent unmodified DNA, RNA, and proteins so that it can also describe their non-canonical forms. It supports a wider range of monomeric forms, including monomeric forms that are not defined in the predefined alphabets; left and right caps, such as 5' caps; intrastrand crosslinks (additional bonds between non-adjacent monomeric forms); nicks (the absence of a bond between adjacent monomeric forms); and both linear and circular polymer topologies. BpForms has concrete semantics for generating molecular structures from its compressed representation of sequences of monomeric forms. The grammar is defined in Lark syntax, which is based on EBNF syntax.
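To make the features above more concrete, here is a minimal Python sketch of a parser for a toy notation that is only loosely modeled on them. It is not the BpForms grammar and does not use Lark or the BpForms package. The notation itself (braces for multi-character monomeric form codes, `^` for a nick, a trailing `| circular` attribute) and the names `ToyPolymer` and `parse_toy` are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy notation (NOT the real BpForms grammar):
#   - a single uppercase letter is a canonical monomeric form, e.g. "A"
#   - "{name}" is a non-canonical monomeric form with a multi-character code
#   - "^" between two monomeric forms marks a nick (no bond between neighbors)
#   - a trailing "| circular" attribute marks a circular topology
TOKEN = re.compile(r"\{(?P<named>[^{}]+)\}|(?P<single>[A-Z])|(?P<nick>\^)")


@dataclass
class ToyPolymer:
    monomers: list = field(default_factory=list)
    # Each index i means: no bond between monomers[i - 1] and monomers[i].
    nicks: list = field(default_factory=list)
    circular: bool = False


def parse_toy(text: str) -> ToyPolymer:
    """Parse the toy notation into a list of monomeric forms plus attributes."""
    seq, _, attrs = text.partition("|")
    # Only the "circular" attribute is recognized; anything else means linear.
    polymer = ToyPolymer(circular=attrs.strip() == "circular")
    seq = seq.strip()
    pos = 0
    while pos < len(seq):
        match = TOKEN.match(seq, pos)
        if match is None:
            raise ValueError(f"unexpected character {seq[pos]!r} at position {pos}")
        if match.group("nick"):
            polymer.nicks.append(len(polymer.monomers))
        else:
            polymer.monomers.append(match.group("named") or match.group("single"))
        pos = match.end()
    return polymer
```

For example, `parse_toy("AC{m2A}^GU | circular")` yields five monomeric forms, one of them non-canonical, with a nick recorded before the fourth form and the topology flagged as circular. A real implementation would also need caps, crosslinks, and inline-defined monomeric forms, and would map each monomeric form to a chemical structure, none of which this sketch attempts.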
dna protein rna polypeptide region