labblouin.PDBnet module

PDBnet is a collection of Python objects intended to model and contain PDB protein data.

PDBnet Copyright (C) 2012 Christian Blouin. Contributions by Alex Safatli and Jose Sergio Hleap. Major refactoring (2014) done by Alex Safatli and Jose Sergio Hleap.

E-mail: cblouin@cs.dal.ca, safatli@cs.dal.ca, jshleap@dal.ca

Dependencies: SciPy, BioPython, FASTAnet (contained in LabBlouinTools)

class labblouin.PDBnet.PDBatom(serial, name, x, y, z, oc, b, symbol, charge)[source]

Bases: object

ATOM in a PDB protein structure.

DistanceTo(atom)[source]

Acquire the distance from this atom to another atom.
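
As an illustration, the distance between two atoms can be sketched as a Euclidean distance over their (x, y, z) coordinates. This is a standalone sketch that assumes a Euclidean metric; the `distance` helper and the sample coordinates are hypothetical and do not use PDBnet itself.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y, z) coordinate tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Two hypothetical atom positions, in Angstroms.
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (3.0, 4.0, 0.0)
print(distance(atom_1, atom_2))
```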

GetPosition()[source]

Get the 3-dimensional coordinate of this atom in space.

charge
fixname()[source]

Ensures the name of this atom fits within 4 characters.

name
occupancy
parent
serial
symbol
tempFactor
x
y
z
class labblouin.PDBnet.PDBchain(name)[source]

Bases: object

A PDB chain (collection of protein residues).

AddIndexOfResidue(index)[source]

Add the index of a residue to this chain. The residue's own information is only generated when it is later requested (lazy evaluation).

AddResidue(resid)[source]

Add a residue to this chain.

AddResidueByIndex(index)[source]

Add a residue to this chain by its index in the PDB. The residue object will automatically be constructed from the file.

AsFASTA()[source]

Return the string representing this chain’s FASTA sequence.
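
A FASTA sequence is typically obtained by translating each residue's three-letter name into its one-letter code. The sketch below shows that translation under stated assumptions: the `residues_to_fasta` helper is hypothetical, the header line format is assumed, and writing unknown residues as `X` is a choice of this sketch rather than documented PDBnet behaviour.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}


def residues_to_fasta(chain_name, residue_names):
    """Build a FASTA record from an ordered list of three-letter residue names."""
    # Unknown residue names are written as 'X' (an assumption of this sketch).
    sequence = ''.join(THREE_TO_ONE.get(r.upper(), 'X') for r in residue_names)
    return '>%s\n%s' % (chain_name, sequence)


record = residues_to_fasta('A', ['MET', 'ALA', 'GLY', 'SER'])
print(record)
```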

ContactMap(thres=4.5)[source]

Compute the contact map of this chain.

GetAtoms()[source]

Get all comprising atoms in this chain in order of residues.

GetIndices()[source]

Acquire all of the indices for residues present in this chain.

GetPrimaryPropertiesFromBioPython()[source]

Use BioPython to populate this class with attributes.

GetResidueByIndex(index)[source]

Alias for acquiring a residue from this instance using [] operator.

GetResidues()[source]

Acquire all of the residues in this chain. Note that if this is a large file, this will force the object to load all of these residues into memory from the file.

IterResidues()[source]

Iteratively yield all of the residues in this chain. Note that if this is a large file, this will force the object to load all of these residues into memory from the file.

RemoveResidue(resid)[source]

Remove a residue from this chain.

SortByNumericalIndices()[source]

Sort all internal items by a numerical index.

WriteAsPDB(filename)[source]

Write this single chain to a file as a PDB.

indices
name
parent
pop(s)[source]

Pop a residue.

residues
structure
update(o)[source]

Add residues from another chain.

class labblouin.PDBnet.PDBfile(fi)[source]

Bases: object

chains
close()[source]

Close the file.

fileHandle
filePath
getChainNames()[source]
getModelNames()[source]
getResidueNamesInChain(ch)[source]
hasModels()[source]
hasResidue(chain, res, model=-1)[source]

Determine if a residue is present in the PDBfile.

isLargeFile()[source]

Return whether or not the PDB file this object represents is incredibly large.

iterResidueData()[source]

Yield the model number, chain name, and residue number for all residues present in this PDB file, not necessarily in order.

memHandle
modelIndices
read()[source]

Acquire all remarks, and the indices of all models and residues. Returns the remarks and biological source information as a tuple (remarks, organism, taxid, mutant).

readResidue(chain, res, model=-1)[source]

Parse a residue from the PDB file and return a PDBresidue.

residueIndices
size
class labblouin.PDBnet.PDBmodel(name)[source]

Bases: labblouin.PDBnet.PDBchain

A PDB model (a special kind of chain).

AddChain(chain)[source]

Add a PDBchain (chain) to this model.

AddResidue(resid)[source]

Add a residue to this model.

GetChain(name)[source]
GetChainByName(name)[source]
GetChainNames()[source]
GetChains()[source]
GetResidues()[source]

Acquire all of the residues in this model. Note that if this is a large file, this will force the object to load all of these residues into memory from the file.

IterResidues()[source]

Iteratively yield all of the residues in this model. Note that if this is a large file, this will force the object to load all of these residues into memory from the file.

NewChain(ch)[source]

Create a new PDBchain, add it to this model, and return it.

chainNames
chains
indices
name
residues
structure
class labblouin.PDBnet.PDBresidue(index=None, name='')[source]

A residue (collection of ATOM fields) in a PDB protein structure.

AddAtom(atom)[source]

Add a PDBatom structure to this residue.

Centroid()[source]

Calculate the centroid of this residue. Return this as a PDBatom.
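
A centroid is the mean position of a set of points. The following standalone sketch assumes an unweighted mean over atom coordinates; whether PDBnet weights atoms or excludes any of them is not stated here, and the `centroid` helper is hypothetical.

```python
def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    if not coords:
        raise ValueError('cannot take the centroid of an empty residue')
    n = float(len(coords))
    return tuple(sum(c[axis] for c in coords) / n for axis in range(3))


# Two hypothetical atom positions belonging to one residue.
atoms = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
center = centroid(atoms)
print(center)
```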

GetAtoms()[source]
GetCA()[source]

Get the alpha-carbon found in this residue as a PDBatom.

InContactWith(other, thres=4.5)[source]

Determine if in contact with another residue.
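
One common contact criterion treats two residues as in contact when any pair of their atoms lies within the threshold distance. The sketch below assumes that criterion and a threshold in Angstroms; the rule PDBnet actually applies may differ, and `in_contact` is a hypothetical helper (it relies on `math.dist`, available from Python 3.8).

```python
import math


def in_contact(atoms_a, atoms_b, thres=4.5):
    """True if any atom of one residue is within thres of any atom of the other."""
    return any(math.dist(a, b) <= thres for a in atoms_a for b in atoms_b)


# Hypothetical residues, each given as a list of atom coordinates.
res_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
res_2 = [(5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
res_3 = [(20.0, 0.0, 0.0)]

print(in_contact(res_1, res_2))
print(in_contact(res_1, res_3))
```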

class labblouin.PDBnet.PDBstructure(filein='')[source]

Bases: object

A PDB protein structure (a collection of chains/models).

AddChain(chainname, chain)[source]

Add a chain as a list of residues to the PDB.

AddModel(modelname, model)[source]

Add a model as a list of residues to the PDB.

AddRemark(remark)[source]

Add a remark (note/comment) to the structure/PDB file.

AddResidueToChain(chain, res)[source]

Add a residue to a chain. Deprecated; use chain class function.

AddResidueToModel(model, res)[source]

Add a residue to a model. Deprecated; use model class function.

ChainAsFASTA(chain)[source]

Return the chain as a FASTA. Deprecated; use chain class function.

CheckComplete()[source]

For every chain, check to see if every residue is complete (see aa_list dictionary).

Contacts(chain=None, thres=4.5)[source]

Compute the contact map of all chains or a chain.

Parameters:
  • chain – A list of chain or model names or a single string or integer. By default, entire structure.
  • thres – A threshold for distinguishing contact in Angstroms.
Returns:

A list of tuples of indices (integers) which correspond to chains or models and their residues.
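
To make the idea of a contact map concrete, the sketch below collects every pair of residues whose representative points lie within the threshold. It is a simplification under stated assumptions: each residue is reduced to a single point, the result is a plain list of (index, index) pairs rather than the exact tuple layout returned by Contacts, and `contact_pairs` is a hypothetical helper.

```python
import math
from itertools import combinations


def contact_pairs(positions, thres=4.5):
    """List index pairs whose points are within thres of each other.

    positions maps a residue index to one representative (x, y, z) point.
    """
    pairs = []
    for i, j in combinations(sorted(positions), 2):
        if math.dist(positions[i], positions[j]) <= thres:
            pairs.append((i, j))
    return pairs


# Three hypothetical residues spaced 3.8 Angstroms apart along one axis.
positions = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
pairs = contact_pairs(positions)
print(pairs)
```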

FDmatrix(fasta, chains=None, scaled=True)[source]

Compute the form difference matrix (FDM) as explained in Claude 2008. It relates to the identification of the most influential residue with respect to the overall shape/structure. If the scaled option is True, returns a scaled list (from -1 to 1) of length equal to the number of residues. Otherwise, returns the raw FDM, rounded so it can be included in a PDB. The scaled version is better for visualization. By default the FDM is computed for all chains, but a subset can be passed to the chains option.

GetAllCentroid(chain)[source]

Populates the centroids of all residues.

GetAverage(chains=None, newname=None)[source]

Acquire a new chain or model corresponding to an average of all present chains or models specified.

GetChain(ch)[source]

Get a chain by name.

GetChainNames()[source]
GetFASTAIndices(thing, fst)[source]

Given a PDBchain, find 1-to-1 correspondences between it and a FASTA sequence.

GetModel(mod)[source]

Get a model by name.

GetModelNames()[source]
GetRemarks()[source]

Return all remarks from the PDB as a list of strings.

IndexSeq(chain, fst)[source]

Store in residues the correct index to the FASTA. Requires a 1-to-1 correspondence at least a portion of the way through. Deprecated; use GetFASTAIndices().

IterAllResidues()[source]

Produce an iterator to allow one to iterate over all possible residues.

IterResiduesFor(chains=None)[source]

Produce an iterator to allow one to iterate over all residues for a subset of the structure.

Map2Protein(outname, lis, chain, fasta)[source]

Map a list of values (lis) onto the protein. The list must have a length equal to the number of residues in the PDB to be mapped (chain). If a list of lists is provided, the first list will be mapped as the beta factor and the second as occupancy.

ModelAsFASTA(model)[source]

Return the model as a FASTA. Deprecated; use chain class function.

NewChain(name)[source]

Construct and add a new chain by name to the PDB. Returns the chain.

NewModel(name)[source]

Construct and add a new model by name to the PDB. Returns the model.

RadiusOfGyration(chains=None)[source]

Acquire the radius of gyration of the entire PDB protein molecule, or of a portion of it.
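
The radius of gyration measures how far, on average, the points of a molecule lie from its center. The sketch below uses the unweighted form (root of the mean squared distance to the centroid); whether PDBnet applies mass weighting or uses atoms rather than residue centroids is not stated here, and `radius_of_gyration` is a hypothetical helper.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) tuples."""
    n = len(coords)
    if n == 0:
        raise ValueError('at least one coordinate is required')
    center = [sum(c[axis] for c in coords) / float(n) for axis in range(3)]
    squared = sum(
        sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for c in coords
    )
    return math.sqrt(squared / n)


# Two hypothetical points placed symmetrically about the origin.
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
print(rg)
```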

ReadFile(filename)[source]

Read a PDB file. Populate this PDBstructure.

RemoveChain(name)[source]

Remove a chain from the structure (by name). Returns the chain.

RemoveModel(name)[source]

Remove a model from the structure (by name). Returns the model.

ViewStructure()[source]
WriteContacts(filename)[source]

Write contact map.

WriteFile(filename)[source]

Write this PDB structure as a single PDB file.

WriteGM(fasta, gm, chains=None, CA=False)[source]

Write the information present in this PDB between multiple chains as a Geometric Morphometric text file. This file will be formatted such that individual lines correspond to chains and semi-colons separate the (x,y,z) coordinates between all homologous residue positions. Requires a FASTA alignment. Options include using alpha-carbon positions. By default, uses centroids of residues.

WriteLandmarks(fasta, lm, chains=None)[source]

Write the information present in this PDB between multiple chains as a landmark text file. The file is partitioned into sections, each starting with a chain name. Individual lines within a section correspond to homologous residue positions and give the homologous position, residue number, and residue name, tab-delimited. Requires a FASTA file.

chains
contactmap
filepath
gdt(fasta, chains=None, distcutoffs=[1, 2, 4, 8], CA=True)[source]

Get the GDT score between two chains. Requires a FASTA alignment.
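
A GDT-style score averages, over several distance cutoffs, the fraction of aligned residue pairs that lie within each cutoff. The sketch below works from precomputed per-residue distances and assumes a single fixed superposition, with the result expressed as a fraction between 0 and 1; the full GDT procedure searches over superpositions, and PDBnet's scaling is not stated here. The function name and sample distances are hypothetical.

```python
def gdt_from_distances(distances, cutoffs=(1, 2, 4, 8)):
    """Mean fraction of distances at or below each cutoff (in Angstroms)."""
    if not distances:
        raise ValueError('at least one distance is required')
    n = float(len(distances))
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return sum(fractions) / len(fractions)


# Hypothetical distances between four pairs of homologous residues.
score = gdt_from_distances([0.5, 1.5, 3.0, 10.0])
print(score)
```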

gm(fasta, chains=None, CA=False, typeof='str')[source]

Acquire Geometric Morphometric data corresponding to the (x,y,z) coordinates between all homologous residue positions. Requires a FASTA alignment. Options include using alpha-carbon positions. By default, uses centroids of residues. Returns a list of labels and a list of coordinates as raw GM data. The typeof option selects the coordinate output format: a semicolon-delimited string (str) or a NumPy 2D array (matrix).

handle
ismodel
models
mutation
orderofchains
orderofmodels
organism
read(filename)[source]

Alias for ReadFile().

remarks
rmsd(fasta, chains=None, CA=True)[source]

Get the RMSD between chains. Requires a FASTA alignment.
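
RMSD is the square root of the mean squared distance between paired positions. The sketch below assumes the two coordinate lists are already paired by homologous position and already superimposed; it performs no alignment or fitting, and `rmsd_of_pairs` is a hypothetical helper (it relies on `math.dist`, available from Python 3.8).

```python
import math


def rmsd_of_pairs(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError('coordinate lists must be non-empty and equally long')
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical paired positions from two chains.
chain_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
chain_b = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
value = rmsd_of_pairs(chain_a, chain_b)
print(value)
```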

rrmsd(fasta, chains=None, CA=True)[source]

Get the RRMSD between chains. Requires a FASTA alignment. See Betancourt & Skolnick, “Universal Similarity Measure for Comparing Protein Structures”.

taxid
tmscore(fasta, chains=None, native=None, CA=True)[source]

Get the TMscore between two chains. Requires a FASTA alignment and a value for the length of the native structure (e.g., for a pairwise alignment, the length of the structure used as a reference before alignment was done). The latter is computed by assuming the first of both provided chains is the native structure; otherwise, uses a provided chain name (native input).
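
The role of the native length can be seen in the standard TM-score formula, where it sets both the normalisation and the distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8. The sketch below evaluates that formula from precomputed distances under stated assumptions: a single fixed superposition (the full method maximises over superpositions), a floor of 0.5 on d0 for short chains, and hypothetical helper names.

```python
def tm_d0(native_length):
    """Distance scale d0 for a native structure of the given length."""
    if native_length <= 15:
        # The formula is not meaningful for very short chains;
        # a floor of 0.5 is assumed in this sketch.
        return 0.5
    return max(0.5, 1.24 * (native_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_from_distances(distances, native_length):
    """TM-score for aligned-pair distances, normalised by the native length."""
    d0 = tm_d0(native_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / native_length


# Hypothetical case: 50 perfectly matched pairs against a native of length 100.
score = tm_score_from_distances([0.0] * 50, 100)
print(score)
```

Because the sum runs over aligned pairs but is divided by the native length, residues of the native structure that have no aligned partner lower the score.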

view(istrajectory=False)[source]

View the structure in a PyMOL window. Requires an installation of PyMOL.

class labblouin.PDBnet.PDBterminator(chaininst)[source]

Bases: labblouin.PDBnet.PDBatom

A placeholder class that represents a terminating ATOM-like line in the PDB file.

charge
lastatom
lastreschain
lastresind
lastresname
name
occupancy
parent
serial
symbol
tempFactor
x
y
z
