Some Molecular Biology Scripts

All should be considered beta versions, and they may require a bit of tweaking to work.
Some take command line arguments, and for others you have to edit the script itself.
You can e-mail me (below) with questions, problems, or suggestions. For additional scripts, check our public code repo on Bitbucket...

[]convert fasta file with long names to phylip with short names. useful for phyml and raxml. also can concatenate sequences or convert some file formats
[]convert short names generated by back into long names, for example in a tree file. can also edit the lookup table to clean up or add annotations to tree names
[]from a fasta file of amino acid sequences, print out a list of calculated values including molecular weight, charge, and percent composition. also search proteins for short motifs
[]read a fasta file, do 6-frame translation, print best protein seqs as fasta
[]mini library required for use with translatedna, seqlite and filterseqs
[]retrieve genbank records from accession numbers
[]perform blast searches on a fasta file, save abbreviated result table (requires biopython and local blast+ installation)
[]Convert Lucid Builder CSV exported files to a NEXUS formatted table
[]from a sequence alignment, return only the variable sites and sequences. (requires, above)
[]from a fasta file, return only the size of each sequence, for making a histogram, etc
[]given an aligned amino acid file and a corresponding unaligned DNA file, insert gaps into the DNA file to create an alignment
[]join together multiple files as one, using *.fta formulation
[]sort through a fasta file, keeping or rejecting sequences with names that contain a certain string. requires, above
[]sort through a fasta file, keeping or rejecting sequences which contain a certain subsequence. search can be specified as regexp: "ATG$". requires, above
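
To give a feel for what a few of the descriptions above mean in practice, here is a minimal Python sketch of four of the operations: sequence sizes, filtering by a regexp subsequence, variable sites, and threading gaps from a protein alignment into DNA. This is not the code of the scripts listed here. All function names (read_fasta, sequence_lengths, filter_by_pattern, variable_sites, thread_gaps) are hypothetical, and the sketch assumes that a gap character counts as variation in a column and that the DNA is in frame with exactly three bases per residue.

```python
import re


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def sequence_lengths(records):
    """Return (name, length) pairs, e.g. as input for a histogram."""
    return [(name, len(seq)) for name, seq in records]


def filter_by_pattern(records, pattern, keep=True):
    """Keep (or reject, if keep=False) sequences matching a regexp."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(n, s) for n, s in records if bool(regex.search(s)) == keep]


def variable_sites(records):
    """From aligned sequences, keep only the columns that vary.

    Assumption: any difference, including a gap character, counts
    as variation.
    """
    seqs = [s for _, s in records]
    if not seqs:
        return []
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = [i for i in range(len(seqs[0]))
               if len({s[i] for s in seqs}) > 1]
    return [(n, "".join(s[i] for i in columns)) for n, s in records]


def thread_gaps(aligned_protein, dna):
    """Insert '---' into dna wherever aligned_protein has a gap.

    Assumption: dna is in frame, with three bases per residue.
    """
    out, pos = [], 0
    for residue in aligned_protein:
        if residue == "-":
            out.append("---")
        else:
            out.append(dna[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    example = ">a\nATGCC\nA\n>b\nATGCAA\n>c\nCCGATG\n"
    recs = read_fasta(example)
    for name, size in sequence_lengths(recs):
        print(name, size)
    print(filter_by_pattern(recs, "ATG$"))
    print(variable_sites(recs))
    print(thread_gaps("M-K", "ATGAAA"))
```

The regexp filter uses the same "ATG$" style of search mentioned above; passing keep=False rejects the matching sequences instead of keeping them.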

Return to Steve Haddock's Home Page     E-mail:     Last modified: Apr. 17, 2013