Software
Here are some scripts developed in our lab.
BCGENE scripts
NOTE: You can download everything you need HERE (ZIP 1 kB) instead of downloading the files and scripts below individually.
Download and install the following
(A) PERL and proxy configuration scripts:
PERL (MSI 10.74 MB) - download and install ActivePerl
proxy.bat (BAT 1 kB) - opens a DOS window with proxy settings set up for the TCD proxy (needed for make_hapmap_checks.pl)
(B) BCGENE/BCSNPMAX scripts:
(1) make_cardiff_to_common.pl (PL 1 kB) / make_kbio_to_common.pl (PL 10 kB) / make_snapshot_to_common.pl (PL 11 kB) / make_taqman_to_common.pl (PL 10 kB) - convert from various genotype formats to a common format for further processing. Note: you only need the one that matches the genotyping format of your data.
(2) make_common_to_clean.pl (PL 1 kB) - separates the data into association study, hapmap, duplicate/positive control samples and negative control samples
(3) make_forward_strand.pl (PL 12 kB) - converts allele information to forward strand
(4) make_hapmap_checks.pl, version 7 (PL 15 kB) - compares HapMap control SNPs in the data to HapMap and infers error rates
(5) make_local_to_hapmap.pl (PL 7 kB) - takes a BC|SNPMAX export file and merges it with downloaded HapMap data
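As a rough illustration of the kind of check described for make_hapmap_checks.pl, the sketch below estimates a genotyping error rate as the fraction of control genotypes that disagree with a reference set. It is not taken from the script itself: the function name `discordance_rate`, the dictionary layout and the sample and SNP ids are all hypothetical, and it assumes both genotype sets are already reported on the same strand.

```python
def discordance_rate(observed, reference):
    """Fraction of compared genotypes that disagree with the reference.

    Both arguments map (sample_id, snp_id) -> genotype string such as "AG".
    Allele order is ignored, so "AG" and "GA" count as the same genotype.
    Assumption: both sets are reported on the same strand.
    Returns None when no genotype could be compared.
    """
    compared = 0
    mismatched = 0
    for key, obs in observed.items():
        ref = reference.get(key)
        if not obs or not ref:
            continue  # missing call on either side: nothing to compare
        compared += 1
        if sorted(obs.upper()) != sorted(ref.upper()):
            mismatched += 1
    if compared == 0:
        return None
    return mismatched / compared


# Hypothetical control genotypes and matching reference genotypes.
observed = {
    ("S1", "rs1"): "AG",
    ("S1", "rs2"): "CC",
    ("S2", "rs1"): "GA",
    ("S2", "rs2"): "CT",
}
reference = {
    ("S1", "rs1"): "AG",
    ("S1", "rs2"): "CC",
    ("S2", "rs1"): "AG",
    ("S2", "rs2"): "CC",
}
rate = discordance_rate(observed, reference)  # 1 mismatch in 4 comparisons
```

Ignoring allele order keeps a heterozygote reported as "GA" from being counted as an error against "AG".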
SNP Datafiles and support documentation
schema (PDF 12 kB) - overall summary of the pipeline. Look at it!
README__GENOTYPES_2_BCSNPMAX.txt (TXT 2 kB) - short README file for examples of how to run all the scripts. READ THIS.
See the READMEs below for more detailed descriptions:
README - Bi-allelic SNP Assay Info Files.rtf (RTF 28 kB)
Cardiff_Assay_Infofile.txt (TXT 1 kB)
Custom_Taqman_Assay_Infofile.txt (TXT 1 kB)
Kbiosciences_Assay_Infofile.txt (TXT 1 kB)
Local_to_HapMap_Assay_Infofile.txt (TXT 1 kB)
Pre-designed_Taqman_Assay_Infofile.txt (TXT 1 kB)
RFLP_Assay_Infofile.txt (TXT 1 kB)
Snapshot_Assay_Infofile.txt (TXT 1 kB)
Additional BCGENE files
proxy.bat (BAT 1 kB) - opens a DOS window with proxy settings set up for the TCD proxy (needed for make_hapmap_checks.pl (PL 15 kB))
run_pipeline.pl (PL 8 kB) - run all the scripts in the TAQMAN/Kbio etc. to BCGENE pipeline
README__GENOTYPES_2_BCSNPMAX.txt (TXT 2 kB) - short README file for examples of how to run all the scripts
Protocol: construction of marker information files for BCGENE
Enter the SNP id in the UCSC genome browser and search. The SNP is highlighted in the resulting graphical display. For C/T or A/G SNPs there is no problem with strandedness: if UCSC says strand '-' for an A/G SNP, the strand info in the marker file is '2'; if UCSC says strand '+' for a C/T SNP, the strand info in the marker file is '1'.
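The rule above can be sketched as a small lookup. This is only an illustration: the function name `strand_code` is hypothetical, and it assumes that the two examples generalise, i.e. that UCSC strand '+' always maps to '1' and '-' always maps to '2' for C/T and A/G SNPs. Any other allele pair is rejected so that it gets checked by hand.

```python
# Allele pairs the protocol describes as unproblematic for strandedness.
UNAMBIGUOUS_PAIRS = {frozenset("CT"), frozenset("AG")}

# Assumed mapping from UCSC strand to marker-file strand code,
# generalised from the two examples in the protocol.
STRAND_CODES = {"+": "1", "-": "2"}


def strand_code(alleles, ucsc_strand):
    """Return the marker-file strand code for a SNP.

    alleles     -- allele pair as shown by UCSC, e.g. "A/G" or "C/T"
    ucsc_strand -- "+" or "-" as reported by the UCSC genome browser
    """
    pair = frozenset(a.strip().upper() for a in alleles.split("/"))
    if pair not in UNAMBIGUOUS_PAIRS:
        raise ValueError(f"{alleles}: not a C/T or A/G SNP, check strand manually")
    if ucsc_strand not in STRAND_CODES:
        raise ValueError(f"unknown strand: {ucsc_strand!r}")
    return STRAND_CODES[ucsc_strand]
```

For example, `strand_code("A/G", "-")` gives '2' and `strand_code("C/T", "+")` gives '1', matching the two cases in the protocol.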
Other scripts
GeneViewer.pl (PL 1 kB) - view any number of regions along a sequence, e.g. SNPs, exons, transcription factor binding sites. Very handy visualisation tool.
get_hapmap_snps.pl (PL 1 kB) - retrieve genotype data for a list of SNPs for a specified population from HapMap.
cross_ref.pl (TXT 1 kB) - cross-reference a list of SNPs against exons and other genomic landmarks of putative functional importance. Assigns each SNP a score that increases with the number of landmarks it overlaps. Handy for prioritising the SNPs most likely to be functional.
get_snp_coordinates.pl (PL 3 kB) - takes a file with SNP ids in the first column and gets coordinates for the SNPs from dbSNP 126 (the download might take a while as dbSNP is bundled with the script). Output format is snpid, chrom, start, stop, followed by any other info in your input file. NOTE: you need PERL and Cygwin/UNIX to run the script.
gene_2_genenetwork.pl (PL 1 kB) - link one or more genes to a protein interaction dataset. Graphviz is included in the zip file, which will allow you to visualise the interaction data.
COIP - Case-only epistasis test version 1.3 (Ricardo Segurado) - UPDATED 17 Feb 2010
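The overlap scoring described for cross_ref.pl above can be sketched as follows. This is a minimal illustration and not the script's actual logic: it assumes the score is simply the number of features whose interval contains the SNP position, with inclusive coordinates, and the names `Feature` and `score_snp` and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A genomic landmark, e.g. an exon or a binding site."""
    name: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def score_snp(chrom, pos, features):
    """Return (score, names of overlapping features) for one SNP position.

    Assumption: the score is the count of features that contain the position.
    """
    hits = [
        f.name
        for f in features
        if f.chrom == chrom and f.start <= pos <= f.end
    ]
    return len(hits), hits


# Hypothetical landmarks on one chromosome.
features = [
    Feature("exon_1", "chr1", 100, 200),
    Feature("tfbs_A", "chr1", 150, 160),
    Feature("exon_2", "chr2", 100, 200),
]
score, hits = score_snp("chr1", 155, features)  # overlaps exon_1 and tfbs_A
```

Sorting SNPs by this score in descending order would then give a simple priority list.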
Useful datasets
GenBank_2_HUGO_2_snp126.june07 (ZIP 1 kB) - dbSNP126 linked to HUGO gene nomenclature and GenBank ids.
Software
| BC Platform | Genetic and Clinical Data Management Solution | http://www.bcplatforms.com/ |
| Haploview | Haplotype Characterisation And Viewing Tool | http://www.broad.mit.edu/mpg/haploview/ |
| Perl | Scripting Language | http://www.perl.com/ |
| PLINK | Genetic Analysis Package | http://pngu.mgh.harvard.edu/~purcell/plink/ |
| R-Project | Statistical Computing Package | http://www.r-project.org/ |
| Stata | Statistical Analysis Package | http://www.stata.com |
Other useful programs
WGET (ZIP 156 kB) - HTTP retrieval (used by some of the scripts on this page to retrieve, e.g., genotype data from HapMap).
Cygwin - very cool Linux emulator for Windows. Note: select "source" when downloading Linux components. The mirror cygwin.cict.fr tends to give the least trouble.
perl2exe.exe (EXE 287 kB) - convert PERL scripts into executables.