Proteax function listing

The following functions are callable via the Web API. Because the API presents a very generic interface to Proteax, all parameters must be passed positionally as p1, p2, p3, and so on, not as named parameters.

To see examples of the API usage, please visit the help page.
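
As a rough illustration of the positional convention, the sketch below maps a function name and its arguments onto p1, p2, p3... fields. The 'function' field name and the idea of sending the fields as a URL-encoded query are assumptions made for this example; only the p1, p2, p3... naming comes from this listing.

```python
from urllib.parse import urlencode


def build_query(function_name, *args):
    """Map positional arguments to p1, p2, p3... fields.

    The 'function' key is a hypothetical placeholder, not a documented
    Proteax field name.
    """
    fields = {"function": function_name}
    for index, value in enumerate(args, start=1):
        fields[f"p{index}"] = str(value)
    return fields


fields = build_query("formula_mult", "C6H12O6", 2)
print(urlencode(fields))  # function=formula_mult&p1=C6H12O6&p2=2
```

Optional parameters are simply left off the end of the argument list in this sketch.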



as_fasta

Protein entry converted to FASTA format. Note that this strips all chemical annotations.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

as_gpmaw

Protein entry converted to GPMAW format.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [output_options] Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications.

as_helm

Protein entry converted to Pfizer HELM notation. EXPERIMENTAL function at present.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

as_molfile

The chemical 2D structure that the supplied protein entry represents. The structure is represented in MDL molfile format (V2000 or V3000 depending on molecule size).

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [molfile_options] Controls molfile generation. Available options are 'expansion-mode=all|condensed|minimal residue-proxy-atom=<atom-symbol> proxy-atom-chain-maxlen=<integer> (>= 3) explicit-hydrogen-mask=<mask> x-proxy-atom=<atom-symbol>'. The default 'expansion-mode' is 'all'.
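
The option strings in this listing appear to be space-separated 'name=value' pairs. A minimal sketch of assembling one is shown below; build_options is a hypothetical helper, and the space separator is inferred from the examples in this listing rather than stated by it.

```python
def build_options(**options):
    """Join name=value pairs with spaces, turning '_' in names into '-'.

    Hypothetical helper: it does not check that a name or value is one
    Proteax actually accepts.
    """
    return " ".join(
        f"{name.replace('_', '-')}={value}" for name, value in options.items()
    )


molfile_options = build_options(expansion_mode="condensed", proxy_atom_chain_maxlen=5)
print(molfile_options)  # expansion-mode=condensed proxy-atom-chain-maxlen=5
```

The resulting string would be passed as p2 to as_molfile.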

as_pln

Protein entry converted to PLN (Protein Line Notation) format.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [output_options] PLN output formatting: Use 'residue-format=3-letter' to get 3-letter residue code output. Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications.

as_uniprot

Protein entry converted to UniProt format.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [output_options] Inline modifications control: Use 'inline-mods=include-all|remove-unused|resolve-to-known' to add or clean up inline modifications.

dernot_applied

Returns a protein derivative produced by applying the DerNot expression to the reference protein.

Parameter    Description
(p1) dernot_expression DerNot text adhering to the Biochemfusion DerNot specification.
(p2) ref_protein_text_or_molfile Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

dernot_diff

Calculates the DerNot expression that will produce the given protein when applied to the reference protein.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) ref_protein_text_or_molfile Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p3) [diff_format] DerNot formatting, 1-3 characters: '*' for anonymous expressions, 'L' to force chain locants, 'D' to display deleted residues in des- parts.

dernot_distance

Calculates the distance between two protein entries, expressed as the number of DerNot edits required to get from the given protein to the reference protein.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) ref_protein_text_or_molfile Reference protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

format

The file format of the supplied protein entry. If the input is not recognized an error is returned.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

formula

The sum formula of the chemical structure represented by the supplied protein entry or molfile.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

formula_add

Adds two sum formulas.

Parameter    Description
(p1) formula1 Sum formula in normal or GPMAW format.
(p2) formula2 Sum formula in normal or GPMAW format.
(p3) [result_format] Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'.
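
To make the arithmetic concrete, here is a local sketch of sum-formula addition, with the same representation extended to what formula_sub and formula_mult describe. It assumes the "normal" format is plain element symbols followed by optional counts (for example 'C6H12O6'); GPMAW format, isotopes, and Proteax's own output ordering are not modelled, and all helper names are illustrative.

```python
import re
from collections import Counter

# Element symbol (one uppercase letter, optional lowercase) plus optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(text):
    """Parse a simple formula such as 'C6H12O6' into element counts."""
    counts = Counter()
    for symbol, digits in TOKEN.findall(text):
        counts[symbol] += int(digits) if digits else 1
    return counts


def format_formula(counts):
    """Render counts alphabetically by symbol (a simplification)."""
    return "".join(
        f"{symbol}{count if count != 1 else ''}"
        for symbol, count in sorted(counts.items())
        if count != 0
    )


def formula_add(a, b):
    total = parse_formula(a)
    total.update(parse_formula(b))  # Counter.update adds counts
    return format_formula(total)


def formula_sub(a, b):
    result = parse_formula(a)
    result.subtract(parse_formula(b))  # negative counts are kept as-is here
    return format_formula(result)


def formula_mult(a, multiplier):
    scaled = {symbol: count * multiplier for symbol, count in parse_formula(a).items()}
    return format_formula(scaled)


print(formula_add("C6H12O6", "H2O"))  # C6H14O7
```

How Proteax itself treats a subtraction that would yield a negative element count is not described in this listing.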

formula_element_count

Extracts the number of atoms of a given element within a sum formula.

Parameter    Description
(p1) formula Sum formula in normal or GPMAW format.
(p2) element_symbol Atom symbol of the element whose count you want.

formula_mass_avg

Average molecular weight of the supplied sum formula.

Parameter    Description
(p1) formula Sum formula in normal or GPMAW format.

formula_mass_mono

Mono-isotopic molecular weight of the supplied sum formula.

Parameter    Description
(p1) formula Sum formula in normal or GPMAW format.
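
A mono-isotopic mass is the sum of each element's most abundant isotope mass times its count. The sketch below illustrates this for a handful of elements using rounded standard isotope masses; it assumes simple formulas such as 'C6H12O6' and is not Proteax's implementation or its mass table.

```python
import re

# Rounded masses of the most abundant isotopes (1H, 12C, 14N, 16O, 32S).
MONO_MASS = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.972071,
}


def mono_mass(formula):
    """Sum isotope masses over a simple formula such as 'C6H12O6'."""
    total = 0.0
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in MONO_MASS:
            raise ValueError(f"no mass tabulated for element {symbol!r}")
        total += MONO_MASS[symbol] * (int(digits) if digits else 1)
    return total


print(round(mono_mass("H2O"), 4))  # 18.0106
```

The average-mass variant has the same shape, with standard atomic weights in place of the isotope masses.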

formula_mult

Multiplies a sum formula by an integer number. This means that all element counts will be multiplied by the integer number.

Parameter    Description
(p1) formula Sum formula in normal or GPMAW format.
(p2) multiplier Integer multiplier.
(p3) [result_format] Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'.

formula_sub

Subtracts the second sum formula from the first.

Parameter    Description
(p1) formula1 Sum formula in normal or GPMAW format.
(p2) formula2 Sum formula in normal or GPMAW format.
(p3) [result_format] Output format: 'N' for Normal or 'G' for GPMAW format [old-style], or 'format=normal|gpmaw collapse-isotopes=on|off'.

full_sequence

The full plain sequence of the supplied protein entry, including non-expressed sequence parts.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

full_sequence_mw

The simple average molecular weight of the protein entry sequence. The calculation follows the algorithm given at http://www.expasy.ch/tools/pi_tool-doc.html.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
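
The sequence-based weight can be sketched as the sum of average residue masses plus one water molecule. The residue masses below are the commonly tabulated average values and should be checked against the referenced page before being relied on; the sketch handles a single unmodified chain of the 20 standard residues only.

```python
# Average residue masses (amino acid minus water), in daltons.
AVG_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01524  # average mass of one water molecule


def sequence_mw(sequence):
    """Average molecular weight of one plain, unmodified chain."""
    unknown = set(sequence) - AVG_RESIDUE_MASS.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence) + WATER


print(round(sequence_mw("GG"), 4))  # 132.119
```

Because it ignores chemical annotations, a value computed this way can differ from mw_avg, which is based on the full chemical structure.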

id

The ID, if one is defined, of the supplied protein entry.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

inchi_key

The InChI key of the supplied protein entry or molecule. If the input is a protein entry, the corresponding full structure will be built and the InChI key calculated for that structure.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

inchi_string

The InChI string of the supplied protein entry or molecule. If the input is a protein entry, the corresponding full structure will be built and the InChI string calculated for that structure.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

inline_mods

Lists all inline modifications in the supplied protein entry.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [inline_mod_format] Inline modification output format. Use 'inline-mod-format=sdfile' to output modification data in MDL SD file format. The default output format is as PLN inline-mod properties separated by linefeeds.

list

Lists all terminals and residues of the supplied protein entry. Output is a TAB-delimited table.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

modifications

Lists all modifications in the supplied protein entry by name and locant.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

mol_render_info

Produces condensed or full-structure molecule rendering info for the supplied protein entry. If the input is an MDL molfile the molecule is rendered directly from the connection table without further conversion.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [structure_format] Rendering format controls:
    'F' / 'C'       Generate a Full or Condensed structure molecule (default is condensed).
    'M'             Force Proteax to parse the input as a V2000 molfile.
    'n'             Add a molecule name label (none is displayed by default).
    'l'             Add residue and terminal name labels for expanded fragments (none are displayed by default).
    'a<integer>'    Add absolute residue numbering (labels are added for every <integer> residues).
    'r<integer>'    Add chain-relative residue numbering (labels are added for every <integer> residues).
The 'F', 'C', 'l', 'a', and 'r' format controls are applicable to protein entry input only.

mw_avg

Average molecular weight of the chemical structure represented by the supplied protein entry or molfile. Proteax uses the IUPAC 2007 atomic masses at http://www.chem.qmul.ac.uk/iupac/AtWt/index.html.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

mw_mono

Mono-isotopic molecular weight of the chemical structure represented by the supplied protein entry or molfile. Proteax uses the UniMod masses found at http://www.unimod.org/masses.html.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

name

The name, if one is defined, of the supplied protein entry.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

norm_protein

The ordered expressed chains of the supplied protein entry. This is similar to norm_sequence(), except that the output is PLN so the full chemistry is preserved. Having the full chemistry annotations present enables structural comparison.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

norm_protein_chksum

Runs norm_protein() and then returns the MD5 checksum of the output from norm_protein().

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

norm_sequence

The ordered expressed plain-sequence chains of the supplied protein entry. Chains are separated by periods. Cyclic chains are normalized to ensure identical sort order regardless of in-chain rotation.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
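
One way to picture the normalization is sketched below: each cyclic chain is rotated to a canonical starting point, the chains are sorted, and the result is joined with periods. Choosing the lexicographically smallest rotation and plain string sorting are assumptions for illustration; the listing does not say which rotation or sort order Proteax uses.

```python
def canonical_rotation(chain):
    """Return the lexicographically smallest rotation of a cyclic chain."""
    if not chain:
        return chain
    return min(chain[i:] + chain[:i] for i in range(len(chain)))


def normalize(chains, cyclic=()):
    """Normalize plain-sequence chains.

    chains: list of plain sequences.
    cyclic: indexes (into chains) of the chains that are cyclic.
    """
    normalized = [
        canonical_rotation(chain) if index in cyclic else chain
        for index, chain in enumerate(chains)
    ]
    return ".".join(sorted(normalized))


# Two rotations of the same cyclic chain give the same normalized output.
print(normalize(["GIVEQ", "CAB"], cyclic={1}))  # ABC.GIVEQ
print(normalize(["BCA", "GIVEQ"], cyclic={0}))  # ABC.GIVEQ
```

The point of the rotation step is that the output no longer depends on where a cyclic chain happens to start in the input.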

norm_sequence_chksum

Runs norm_sequence() and then returns the MD5 checksum of the output from norm_sequence().

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
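
For comparison on the client side, an MD5 checksum of a normalized string can be computed as sketched below. UTF-8 encoding and lowercase hexadecimal output are assumptions; this listing does not state the encoding or digest form Proteax returns, so a locally computed value may not match the API's byte for byte.

```python
import hashlib


def md5_checksum(text):
    """Hex MD5 digest of a string, assuming UTF-8 encoding."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# "ABC.GIVEQ" is an illustrative normalized string, not real Proteax output.
checksum = md5_checksum("ABC.GIVEQ")
print(len(checksum))  # 32
```

A fixed-length checksum like this is convenient as an indexed lookup key when the normalized output itself is long.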

protein_key

The protein key consists of the ordered expressed chains of the supplied protein entry, with InChI keys used to represent modified residues. This produces a structurally unique key.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

protein_key_chksum

Runs protein_key() and then returns the MD5 checksum of the output from protein_key().

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

seq_render_info

Produces sequence rendering info for the supplied protein entry.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.
(p2) [residues_per_line] Integer number of residues to display per line.

sequence

The expressed chains of the supplied protein entry. Chains are separated by periods.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

sequence_fingerprint

Feature bitmap fingerprint of the sequence, for use in calculating similarity measures.

Parameter    Description
(p1) protein_text_or_molfile Protein entry in UniProt, PLN, GPMAW, FASTA, or MDL molfile format.

tanimoto_score

Calculates the similarity between two feature bitmaps using the Tanimoto metric.

Parameter    Description
(p1) fingerprint1 Feature bitmap fingerprint.
(p2) fingerprint2 Feature bitmap fingerprint.
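
The Tanimoto metric for two bitmaps is the number of bits set in both divided by the number of bits set in either. The sketch below uses Python integers as bitmaps; the textual format in which sequence_fingerprint returns a fingerprint is not described in this listing, so decoding it into an integer is left out.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two bitmaps held as non-negative integers."""
    common = bin(bits_a & bits_b).count("1")
    union = bin(bits_a | bits_b).count("1")
    # Two empty bitmaps are treated as identical here; this is a convention
    # chosen for the sketch, not documented Proteax behaviour.
    return common / union if union else 1.0


print(tanimoto(0b1100, 0b1010))  # one shared bit out of three set bits
```

Scores range from 0.0 (no shared features) to 1.0 (identical bitmaps).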

version

Returns the Proteax version string.