Proteax function listing
The following functions are callable via the Web API. Because the API presents a very generic interface to Proteax, all parameters must be passed positionally as p1, p2, p3, ..., not as named parameters.
To see examples of the API usage, please visit the help page.
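As a minimal sketch of the positional convention described above, the Python helper below maps an ordered argument list onto the keys p1, p2, p3, and so on. The helper name `build_params` and the sample FASTA entry are illustrative assumptions. The endpoint URL and HTTP method are not stated in this listing, so the sketch stops at producing an encoded parameter string.

```python
from urllib.parse import urlencode


def build_params(*args):
    """Map positional arguments to the keys p1, p2, p3, ... in order."""
    return {f"p{i}": str(arg) for i, arg in enumerate(args, start=1)}


# Hypothetical arguments for as_pln: a FASTA entry plus an output option.
params = build_params(">demo\nACDEFGHIK", "residue-format=3-letter")

# URL-encoded form of the parameters; how it is sent depends on the actual API.
query = urlencode(params)
print(query)
```

Optional parameters (shown in square brackets in the tables) are simply left off the end of the argument list.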
as_fasta
Protein entry converted to FASTA format. Note that this strips all chemical annotations.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
as_gpmaw
Protein entry converted to GPMAW format.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [output_options] | Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications. |
as_helm
Protein entry converted to Pistoia Alliance HELM V2 notation.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [output_options] | HELM output formatting: Use 'pyl-sec-conversion=off' if U (Sec) and O (Pyl) residues should not be output as [seC] and [Pyl]. |
as_molfile
The chemical 2D structure that the supplied protein entry represents. The structure is represented in MDL molfile format (V2000 or V3000 depending on molecule size).
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [molfile_options] | Controls molfile generation. Available options are 'expansion-mode=all|condensed|minimal residue-proxy-atom=<atom-symbol> proxy-atom-chain-maxlen=<integer> (>= 3) explicit-hydrogen-mask=<mask> x-proxy-atom=<atom-symbol>'. The default 'expansion-mode' is 'all'. |
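The option strings in this listing appear to be space-separated key=value pairs. Assuming that reading is correct, the following sketch assembles such a string from Python keyword arguments. The helper name `build_options` is hypothetical, and the chosen option values are only examples.

```python
def build_options(**options):
    """Join keyword options into a 'key=value key=value' string.

    Python identifiers cannot contain hyphens, so underscores in the
    keyword names are converted to hyphens.
    """
    return " ".join(
        f"{key.replace('_', '-')}={value}" for key, value in options.items()
    )


molfile_options = build_options(expansion_mode="condensed", proxy_atom_chain_maxlen=5)
print(molfile_options)  # expansion-mode=condensed proxy-atom-chain-maxlen=5
```

The resulting string would be passed as the second positional parameter (p2).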
as_pln
Protein entry converted to PLN (Protein Line Notation) format.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [output_options] | PLN output formatting: Use 'residue-format=3-letter' to get 3-letter residue code output. Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications. |
as_uniprot
Protein entry converted to UniProt format.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [output_options] | Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications. |
dernot_applied
Returns a protein derivative produced by applying the DerNot expression to the reference protein.
Parameter | Description |
---|---|
(p1) dernot_expression | DerNot text adhering to the Biochemfusion DerNot specification. |
(p2) ref_protein_text_or_molfile | Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
dernot_diff
Calculates the DerNot expression that will produce the given protein when applied to the reference protein.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) ref_protein_text_or_molfile | Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p3) [diff_format] | DerNot formatting, 1-3 characters: '*' for anonymous expressions, 'L' to force chain locants, 'D' to display deleted residues in des- parts. |
dernot_distance
Calculates the distance between two protein entries, expressed as the number of DerNot edits required to get from the given protein to the reference protein.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) ref_protein_text_or_molfile | Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
format
The file format of the supplied protein entry. If the input is not recognized, an error is returned.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
formula
The sum formula of the chemical structure represented by the supplied protein entry or molfile.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
formula_add
Adds two sum formulas.
Parameter | Description |
---|---|
(p1) formula1 | Sum formula in normal or GPMAW format. |
(p2) formula2 | Sum formula in normal or GPMAW format. |
(p3) [result_format] | Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'. |
formula_element_count
Extracts the number of atoms of a given element within a sum formula.
Parameter | Description |
---|---|
(p1) formula | Sum formula in normal or GPMAW format. |
(p2) element_symbol | Atom symbol of the element whose count you want. |
formula_mass_avg
Average molecular weight of the supplied sum formula.
Parameter | Description |
---|---|
(p1) formula | Sum formula in normal or GPMAW format. |
formula_mass_mono
Mono-isotopic molecular weight of the supplied sum formula.
Parameter | Description |
---|---|
(p1) formula | Sum formula in normal or GPMAW format. |
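To illustrate what an average versus a mono-isotopic mass of a sum formula means, here is a small local sketch. It is not Proteax's implementation: the mass tables are hand-entered, commonly quoted values for five elements and may differ from Proteax's tables in the last digits, and the parser assumes that "normal" format is a plain string such as C6H12O6. GPMAW format is not handled.

```python
import re

# Hand-entered masses for a few elements (assumed values, for illustration only).
AVERAGE_MASS = {"H": 1.00794, "C": 12.0107, "N": 14.0067, "O": 15.9994, "S": 32.065}
MONOISOTOPIC_MASS = {
    "H": 1.007825035, "C": 12.0, "N": 14.003074, "O": 15.99491463, "S": 31.9720707,
}

# An element symbol (one capital, optional lowercase letter) and an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_mass(formula, table):
    """Sum count * mass over the element tokens of a formula like 'C6H12O6'.

    Raises KeyError for elements missing from the table.
    """
    total = 0.0
    for symbol, digits in _TOKEN.findall(formula):
        total += table[symbol] * (int(digits) if digits else 1)
    return total


glucose = "C6H12O6"
print(round(formula_mass(glucose, AVERAGE_MASS), 3))       # 180.156
print(round(formula_mass(glucose, MONOISOTOPIC_MASS), 3))  # 180.063
```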
formula_mult
Multiplies a sum formula by an integer, that is, every element count is multiplied by that integer.
Parameter | Description |
---|---|
(p1) formula | Sum formula in normal or GPMAW format. |
(p2) multiplier | Integer multiplier. |
(p3) [result_format] | Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'. |
formula_sub
Subtracts the second sum formula from the first.
Parameter | Description |
---|---|
(p1) formula1 | Sum formula in normal or GPMAW format. |
(p2) formula2 | Sum formula in normal or GPMAW format. |
(p3) [result_format] | Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'. |
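The arithmetic behind formula_add, formula_mult, and formula_sub can be pictured with a short local sketch that works on element counts. It is an illustration under assumptions, not Proteax's code: the function names are hypothetical, "normal" format is assumed to be a plain string such as C6H12O6, the output uses Hill order (C, then H, then alphabetical), and raising an error on negative counts is a choice made for this sketch only.

```python
import re
from collections import Counter

# An element symbol (one capital, optional lowercase letter) and an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Parse a simple sum formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for symbol, digits in _TOKEN.findall(formula):
        counts[symbol] += int(digits) if digits else 1
    return counts


def format_formula(counts):
    """Format element counts in Hill order, omitting zero counts."""
    symbols = sorted(s for s, n in counts.items() if n != 0)
    if "C" in symbols:
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in symbols else []) + rest
    return "".join(s + (str(counts[s]) if counts[s] != 1 else "") for s in symbols)


def add_formulas(formula1, formula2):
    return format_formula(parse_formula(formula1) + parse_formula(formula2))


def subtract_formulas(formula1, formula2):
    result = parse_formula(formula1)
    result.subtract(parse_formula(formula2))
    if any(n < 0 for n in result.values()):
        raise ValueError("subtraction would give a negative element count")
    return format_formula(result)


def multiply_formula(formula, multiplier):
    counts = parse_formula(formula)
    return format_formula(Counter({s: n * multiplier for s, n in counts.items()}))


print(add_formulas("C6H12O6", "H2O"))       # C6H14O7
print(subtract_formulas("C6H12O6", "H2O"))  # C6H10O5
print(multiply_formula("H2O", 3))           # H6O3
```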
full_sequence
The full plain sequence of the supplied protein entry, including non-expressed sequence parts.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
full_sequence_mw
The simple average molecular weight of the protein entry sequence. The calculation follows the algorithm given at http://www.expasy.ch/tools/pi_tool-doc.html.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
id
The ID of the supplied protein entry, if one is defined.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
inchi_key
The InChI key of the supplied protein entry or molecule. If the input is a protein entry, the corresponding full structure will be built and the InChI key calculated for that structure.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
inchi_string
The InChI string of the supplied protein entry or molecule. If the input is a protein entry, the corresponding full structure will be built and the InChI string calculated for that structure.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
inline_mods
Lists all inline modifications in the supplied protein entry.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [inline_mod_format] | Inline modification output format. Use 'inline-mod-format=sdfile' to output modification data in MDL SD file format. By default, the output consists of PLN inline-mod properties separated by linefeeds. |
list
Lists all terminals and residues of the supplied protein entry. Output is a TAB-delimited table.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [output_options] | By default, residue codes are output in 3-letter format. You can change this by using 'residue-format=1-letter'. |
modifications
Lists all modifications in the supplied protein entry by name and locant.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
mol_render_info
Produces condensed or full-structure molecule rendering info for the supplied protein entry. If the input is an MDL molfile the molecule is rendered directly from the connection table without further conversion.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [structure_format] | 'F'/'C' to generate Full/Condensed structure molecule (default is condensed), or 'M' to force Proteax to parse input as a V2000 molfile. 'n' to add a molecule name label (none displayed by default). 'l' to add residue and terminal name labels for expanded fragments (none displayed by default). 'a<integer>' to add absolute residue numbering or 'r<integer>' to add chain-relative residue numbering (labels are added for every <integer> residues). The 'F', 'C', 'l', 'a', and 'r' format controls are applicable to protein entry input only. |
mw_avg
Average molecular weight of the chemical structure represented by the supplied protein entry or molfile. Proteax uses the IUPAC 2007 atomic masses at http://www.chem.qmul.ac.uk/iupac/AtWt/index.html.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
mw_mono
Mono-isotopic molecular weight of the chemical structure represented by the supplied protein entry or molfile. Proteax uses the UniMod masses found at http://www.unimod.org/masses.html.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
name
The name of the supplied protein entry, if one is defined.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
norm_protein
The ordered expressed chains of the supplied protein entry. This is similar to norm_sequence(), except that the output is PLN so the full chemistry is preserved. Having the full chemistry annotations present enables structural comparison.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
norm_protein_chksum
Runs norm_protein() and returns the MD5 checksum of its output.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
norm_sequence
The ordered expressed plain-sequence chains of the supplied protein entry. Chains are separated by periods. Cyclic chains are normalized to ensure identical sort order regardless of in-chain rotation.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
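One common way to make cyclic chains independent of in-chain rotation is to pick a canonical rotation, for example the lexicographically smallest one, before sorting. The sketch below shows that idea. Whether Proteax uses this particular rule, and how it orders chains, is not stated here, so both the rotation rule and the plain alphabetical sort are assumptions, as are the function names.

```python
def canonical_rotation(chain):
    """Return the lexicographically smallest rotation of a cyclic chain."""
    if not chain:
        return chain
    return min(chain[i:] + chain[:i] for i in range(len(chain)))


def normalize(chains, cyclic=()):
    """Canonicalize cyclic chains, sort all chains, and join them with periods.

    `cyclic` holds the indexes of the chains that are cyclic.
    """
    canonical = [
        canonical_rotation(chain) if index in cyclic else chain
        for index, chain in enumerate(chains)
    ]
    return ".".join(sorted(canonical))


# The same cyclic chain written with two different starting residues.
print(normalize(["KLM", "GCA"], cyclic={1}))  # AGC.KLM
print(normalize(["KLM", "CAG"], cyclic={1}))  # AGC.KLM
```

Because both inputs reduce to the same string, downstream comparisons and checksums do not depend on where the cyclic chain was opened.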
norm_sequence_chksum
Runs norm_sequence() and returns the MD5 checksum of its output.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
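For readers who want to compare a checksum locally, an MD5 digest of a normalized string can be computed with Python's standard library as sketched below. The text encoding (UTF-8) and the hexadecimal digest form are assumptions; this listing does not say how Proteax encodes the text or formats the checksum. The sample string is a stand-in for real norm_sequence() output.

```python
import hashlib


def md5_checksum(text):
    """Return the MD5 digest of text as 32 lowercase hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


normalized = "ACDEFGHIK.LMNPQR"  # stand-in for norm_sequence() output
checksum = md5_checksum(normalized)
print(checksum)
```

Two entries with identical normalized output give identical checksums, which makes the checksum convenient as a fixed-length lookup key.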
protein_key
The protein key consists of the ordered expressed chains of the supplied protein entry, with InChI keys used to represent modified residues. This produces a structurally unique key.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
protein_key_chksum
Runs protein_key() and returns the MD5 checksum of its output.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
seq_render_info
Produces sequence rendering info for the supplied protein entry.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
(p2) [residues_per_line] | Integer number of residues to display per line. |
sequence
The expressed chains of the supplied protein entry. Chains are separated by periods.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
sequence_fingerprint
Feature bitmap fingerprint of the sequence, for use in calculating similarity measures.
Parameter | Description |
---|---|
(p1) protein_text_or_molfile | Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format. |
tanimoto_score
Calculates the similarity between two feature bitmaps using the Tanimoto metric.
Parameter | Description |
---|---|
(p1) fingerprint1 | Feature bitmap fingerprint. |
(p2) fingerprint2 | Feature bitmap fingerprint. |
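The Tanimoto metric for two bitmaps is the number of bits set in both divided by the number of bits set in either. The sketch below computes it for bitmaps held as Python integers. The textual format of Proteax fingerprints is not described in this listing, so the integer representation is an assumption, and returning 1.0 for two empty bitmaps is a convention chosen for this sketch.

```python
def tanimoto(fingerprint1, fingerprint2):
    """Tanimoto similarity of two bitmaps given as non-negative integers."""
    common = bin(fingerprint1 & fingerprint2).count("1")   # bits set in both
    either = bin(fingerprint1 | fingerprint2).count("1")   # bits set in at least one
    if either == 0:
        return 1.0  # two empty bitmaps are treated as identical here
    return common / either


print(tanimoto(0b1101, 0b1011))  # 0.5
```

The score ranges from 0.0 (no shared features) to 1.0 (identical bitmaps).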
version
Returns the Proteax version string.