Date | Topic | Handouts | Remarks |
Wed, Feb 1 | Introduction, Computer anatomy, Setup | Unix cheat | In which students find out the course goals and agenda, dissect my old ThinkPad in search of how it works, and learn two key concepts relating Computer Science and Biology: "modularity" and "abstraction". We learn that the difference among image, sound, gene expression and, notably, code files is in the eye of the observer. That brings us to the concept of object-oriented thinking. To conclude, we set up software to move files between filesystems and to log in remotely to the cluster. |
Fri, Feb 3 | Unix, Shell, secret messages in proteomes | Emacs cheat | In which we get familiar with the multitude of UNIX controls and commands, and learn about the filesystem hierarchy and the multi-layered software structure within the Operating System. The power of the Shell transpires as we slice and dice complete proteomes, collecting curious motif statistics. Finally we embark on the Mad Scientist project: searching for the secret message hidden among these 20-character texts. |
Wed, Feb 8 | | AWK cheat, RegEx cheat | In which we learn about an "orchestra" - supercomputing cluster, conducted by the orchestra - dispatch machine; the business of computing in a multi-user, multi-process environment with its politics, priorities, queues and resource limits. We get introduced to an elegant way of describing complex sets of strings and phrases - regular expressions, and a laconic language for string processing - AWK. |
Fri, Feb 10 | Using the BioPerl toolbox for automating tasks | code | In which we witness the power of code sharing in a community of rational, self-interested biologists. We pick at a few examples of BioPerl scripting and dive right into tailoring these to our purposes of automated batch BLASTing and parsing the results. |
Wed, Feb 15 | Post-Valentine protein analysis | Rasmol cheat | In which we get a virtual reality tour of protein-DNA complexes, learn the art of selective "display and paint" of various functional groups and residues, interrogate the proximity and bond info, and combine the structural information with BLAST similarity and CLUSTAL alignment queries. |
Fri, Feb 17 | Hacking phylogeny with Python | Handout | In which we think about phylogenetic trees as related to the phyletic patterns - absence/presence of orthologous genes across multiple genomes. We get the groups of orthologous proteins using Python scripts and resolve the plausible tree of evolution, while learning to reuse the wealth of Biopython libraries. |
| Biological databases and WWWeb | crontab tutorial, curl manual | In which we learn some more about the Web almighty, revisit the concept of the client-server design and the protocol, learn about the command-line HTTP/FTP clients wget and curl, and create an automated updater for exotic genomes and expression data using cron-scheduled task execution under UNIX. |
Fri, Feb 24 | Algorithms + Data structures, Objects | Matlab | In which we discuss computational complexity and recursion, exemplified by two implementations of the Fibonacci sequence. We learn to do step-by-step break-point debugging and memory interrogation to get to the source of exponential growth, and plot the result. Finally we investigate the internal structure of the plot itself, with an emphasis on the child-parent relationship between the plot window and the objects in it, and try to alter some attributes manually and graphically. |
Wed, Mar 1 | Debugger, Clustering analysis in Matlab | | In which we rigorously define what it means for a group of objects to be similar to another group of objects, by reducing it in various ways to the singleton similarity. We interactively explore the outcomes of hierarchical and K-means clustering for various free-hand drawn problem cases. Finally, we discuss the algorithm and implementation of contour interpolation for tracking a moving cell. |
| BioInformatics toolbox, Shell exchange with Matlab | Cancerdata.xls | In which we see how to run MATLAB in batch mode on a cluster, invoke a MATLAB script from Perl and vice versa, obtain expression profile data, and examine clustering and bi-clustering dendrograms. We revisit the notion of object-oriented design by obtaining a complex structured PDB record. Finally we learn to profile the MATLAB script to eliminate performance bottlenecks. |
Wed, Mar 8 | Image and signal processing in Matlab | centrosome.m, Image | In which we appreciate the challenge of algorithm design for non-bipartite matching and the awesome might of combinatorial explosion, while trying to track markers and particles from one movie frame to the next. We learn about locating and identifying features in an image and explore the concept of image similarity. Finally we decide against trying to start up a new Google which searches the Web for images by description. |
Fri, Mar 10 | Cell segmentation + tracking | | |
Wed, Mar 15 | Final Projects | | |
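
The Feb 24 session contrasts two implementations of the Fibonacci sequence to expose where exponential growth comes from. The class worked in Matlab; the sketch below is written in Python purely as an illustration and is not the course's own code. The function names `fib_recursive` and `fib_iterative` and the call counter are hypothetical, introduced here only to make the growth visible.

```python
def fib_recursive(n, counter=None):
    """Naive recursion: the same subproblems are recomputed many times,
    so the number of calls grows exponentially with n."""
    if counter is not None:
        counter[0] += 1  # count every call, including the base cases
    if n < 2:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def fib_iterative(n):
    """Iteration: keeps only the last two values, so the work is linear in n."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def count_recursive_calls(n):
    """Return how many calls fib_recursive makes to compute the n-th number."""
    counter = [0]
    fib_recursive(n, counter)
    return counter[0]


if __name__ == "__main__":
    for n in (5, 10, 20):
        # Both implementations must agree on the value.
        assert fib_recursive(n) == fib_iterative(n)
        print(n, fib_iterative(n), count_recursive_calls(n))
```

Plotting the call count against n, as the session does with its own measurements, should show the recursive version climbing roughly geometrically while the iterative loop runs exactly n steps.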