Check here before you email me please!
This paper please! Also, see the extra programs below!
Please see the note below about BEAST. Consider opening 'command prompt', and running the program through that (you can drag-drop 'phyloGenerator.exe' into command prompt and press enter). That way, if the program suddenly exits, you'll be able to see the error messages it generates.
Also, phyloGenerator needs to write out data (like your alignments), and if it can't do that it will crash. You need to have permission to write information wherever you run the program. If you don't have permission, the program will crash with some error like "IOError: [Errno 13] Permission denied: 'temp.fasta'" or something else that looks like it's to do with writing a file out. Your desktop should always be a safe place to put pG if in doubt!
Only cite the programs you use from the following list:
But you must cite BioPython and its Bio.Phylo component. Sorry not to be more help than this, but it's best the authors tell you what to cite so I don't get in trouble!
For reasons that are unclear to me, Windows will let you install phyloGenerator to a place where it can't write out files (I say unclear to me, because you had to write files to copy phyloGenerator in the first place, right?...). This means pG can't pass information to other programs, because it can't write out the sequences it's downloaded for them to use. Picking an output working directory that you don't have permission to save files to will also cause problems! You should always be able to write files to your desktop, so if in doubt put phyloGenerator on your desktop and run it from there. That should fix this problem; I'm afraid I can't offer any advice about which parts of your computer you have permission to write files to because I'm not sat at your computer! Sorry!
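If you have Python to hand, a quick way to test a folder before running pG from it is to try creating a small file there. This is only a minimal sketch and is not part of phyloGenerator; the function name `can_write_here` is made up for this example.

```python
import os
import tempfile


def can_write_here(directory):
    """Return True if a small temporary file can be created in `directory`."""
    try:
        # Creating (and automatically deleting) a real file is a more direct
        # test than asking the operating system about permissions.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers 'Permission denied' as well as a folder that does not exist.
        return False


if __name__ == "__main__":
    folder = os.getcwd()  # the folder you intend to run pG from
    print(folder, "is writable:", can_write_here(folder))
```

If this prints False for the folder you want to use, move phyloGenerator (and your working directory) somewhere else, such as your desktop.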
It uses examples of what 'good' sequences of the genes you're using look like to guide its search. It sorts candidate sequences according to length, attempts to find genes inside those sequences where necessary, and tries to align candidate sequences to your example sequences. If that alignment isn't too long (that's what the 'tolerance' you set does) then that sequence is accepted, and pG moves on to the next sequences.
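The accept-or-reject step described above can be sketched roughly as follows. This is an illustration and not pG's actual code: the function name, the choice to try the longest candidates first, and the reading of 'tolerance' as a maximum ratio of alignment length to reference length are all assumptions, and `aligned_length` is a hypothetical stand-in for whatever alignment program is used.

```python
def pick_sequence(candidates, reference_length, tolerance, aligned_length):
    """Return the first candidate whose alignment to the references is short enough.

    candidates       -- candidate sequences (strings) for one species
    reference_length -- length of the reference alignment on its own
    tolerance        -- largest allowed ratio of new alignment length to
                        reference length (an assumed reading of 'tolerance')
    aligned_length   -- hypothetical callable: candidate -> length of the
                        alignment of that candidate with the references
    """
    # Assumption: candidates are tried longest first.
    for candidate in sorted(candidates, key=len, reverse=True):
        if aligned_length(candidate) <= tolerance * reference_length:
            return candidate  # accepted; move on to the next species
    return None  # no candidate aligned well enough
```

The point of the length check is that a sequence from the wrong gene tends to force lots of gaps into the alignment, making it much longer than the references alone.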
I'm actively playing around with referenceDownload quite a lot (there is a big trade-off between checking sequences thoroughly and speed), but it has allowed me to make pretty large phylogenies (>1000 spp.) rapidly, and with very little input on my part. I'd strongly advise using this method for large phylogenies, and I'd recommend you just cite the pG paper and mention something along the lines of 'using the VERSION referenceDownload method'. Please let me know what you think!
By default, quite poorly - it's just searching for species names! If that makes you shudder, use the '-taxonIDs' argument to specify particular taxon IDs from GenBank. If you have a way of getting more precise about taxonomy than that, please get in touch with me about it!
While you're here, let me say a few words about taxonomy and GenBank, because a lot of people bring this up. I actually happen to think the GenBank TaxonID system is very good, and does a pretty good job of keeping everything up-to-date and taxonomically reasonable, particularly given it attempts to do so for everything in GenBank! However, I completely agree that taxonomy and name resolution are hard, and I think being careful about these things is important. The advantage pG does have going for it is that it's a reproducible method; your mistakes/successes can be described and repeated by stating what you made pG do, and I think that's a useful first step in getting things right.
This isn't a bug, it's a 'feature'. In other words, this doesn't affect your download. Just leave the program running; your sequences are being downloaded and interpreted correctly.
In my experience, it's rare for phyloGenerator not to give you a good result, but bear in mind that no one can write an automated program that works every time with something like this - otherwise there wouldn't be phylogeneticists in the first place! For difficult projects, there's not likely to be a quick solution. Sorry. Please do contact me because I like to know when runs haven't worked. However, there are two major sources of error that phyloGenerator can help with: absence of a constraint tree, or a bad alignment.
Use Phylomatic to make a constraint, take one from a paper - whatever - that way you
Check your alignment. phyloGenerator will warn you (the column marked 'warn') if your alignments seem too long. Output them, open them in a program like Clustal-X2, and see if there are long stretches of gaps, or an alignment that doesn't look like a set of neatly-lined-up sequences. If you find that, go back to the sequence download stage, trim your sequences, check the lengths, and maybe think about using TrimAl in the alignment stage.
Finally, you may find some species are on
Whenever you need to input a file or directory, you'll need to give phyloGenerator the 'absolute path' to the file. Something like '/Users/will/Documents/dna.fasta', or 'C:\Documents and Settings\dna.fasta'. If you're on a Mac or Linux computer, do not use something like '~/Documents/dna.fasta' - only '/Users/will/Documents/dna.fasta' will work. Doing anything else will produce errors during the program, and because pG uses so many other programs it's hard for me to predict exactly when that might happen. Sorry!
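If you're not sure what the absolute path of a file is, Python can work it out for you. A minimal sketch (the function name `to_absolute_path` is made up for this example, and it is not something pG does for you):

```python
import os


def to_absolute_path(path):
    """Expand '~' and any relative parts so the result is a full, absolute path."""
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    # Prints the full path that you can paste into phyloGenerator.
    print(to_absolute_path("~/Documents/dna.fasta"))
```

Relative paths are resolved against the folder you run the snippet from, so run it from the folder that contains your file.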
First, try installing Cygwin (type 'cygwin' when prompted to go to their website). If that doesn't work, you can try copying a DLL into the 'requires' folder inside the 'phyloGenerator' folder on your computer - again, you can download this when prompted too. If that doesn't work, please contact me. Those two steps have worked for everyone so far, but if they don't for you I want to know!
BEAST uses something called Java. If you're on a Mac, please install the latest version of Java as many users (including me!) cannot run pG with the limited version that ships with MacOS. It seems that some Windows PCs have multiple copies of Java installed, but use the oldest version by default. This causes problems for BEAST, and so it crashes. You need to modify your 'PATH' variable so that its first entry is the directory where your latest Java installation is (here is a walkthrough a friend found helpful). Note that you'll probably want to add something like 'C:\Program Files\Java\java16\bin'. A good way of making sure it's this problem is to go into the 'requires' folder inside phyloGenerator, and double-click on 'BEAST v1.7.1.exe'. If that won't run, you need to update your version of Java (try the 'Java' section of Control Panel); if it does run, you need to follow the PATH instructions above. I'm sorry that I can't automate this for you, but if you run into problems do drop me an email. Only one person has had this problem (to my knowledge!) if that's of any consolation!
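To see which copies of Java your 'PATH' variable points at, and in what order, something like the following may help. It's only a sketch and not part of pG; the function name `java_dirs_on_path` is made up, and it simply looks for a file called 'java' or 'java.exe' in each PATH directory.

```python
import os


def java_dirs_on_path(path_value=None, names=("java", "java.exe")):
    """List the PATH directories that contain a Java executable, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries, then look for any of the expected file names.
        if directory and any(
            os.path.isfile(os.path.join(directory, name)) for name in names
        ):
            found.append(directory)
    return found


if __name__ == "__main__":
    for position, directory in enumerate(java_dirs_on_path(), start=1):
        print(position, directory)
```

The first directory printed is the copy of Java that gets used by default, so that is the one that needs to be your latest installation.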
Press 'control-c' a few times. On a PC, you'll get a dialogue box warning of an error - just hit enter and ignore it. Doing so during a RAxML run will cause problems - see below.
You've got a temporary file on your computer that's stopping RAxML running. Search for all files containing 'RAxML_' (note the capitalisation), and remove them if they're anywhere near phyloGenerator or have the word 'temp' in them. Otherwise, just install a fresh copy of phyloGenerator.
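If your computer's own search is being unhelpful, a short script can list the candidate files for you. This sketch only lists files and deliberately does not delete anything; the function name `find_raxml_leftovers` is made up for this example.

```python
import os


def find_raxml_leftovers(top):
    """Return paths under `top` whose file name contains 'RAxML_' (case-sensitive)."""
    matches = []
    for folder, _subfolders, files in os.walk(top):
        for name in files:
            if "RAxML_" in name:
                matches.append(os.path.join(folder, name))
    return sorted(matches)


if __name__ == "__main__":
    # Search from the current folder; run this from your phyloGenerator folder.
    for path in find_raxml_leftovers(os.getcwd()):
        print(path)
```

Look through the list and remove by hand only the files that are near phyloGenerator or have 'temp' in their name.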
Make sure you are using the correct, full, absolute path. For example, 'Demos/Silwood_Plants/sequences.fasta' is an incomplete (relative) path, but '/Users/will/Documents/phyloGenerator/Demo/Silwood_Plants/sequences.fasta' is a complete, full path.
Also, if you're dragging and dropping files into phyloGenerator when it asks for them, make sure there are no trailing spaces at the end of your filename. If in doubt, press delete - if the file path doesn't appear to change, you had some trailing spaces.
If your folders or files have spaces in them, most computer programs (including phyloGenerator) run into problems. On a Windows computer, putting the file and its path in double quotes (e.g., "C:\My folder\My file.txt"), and on a Mac/UNIX 'escaping' the space with a backslash (e.g., "/My\ Path/My\ file.txt") will help. If in doubt, drag-and-drop the file into your Terminal/Command Prompt to see how your computer wants the file referenced.
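The trailing-space and quoting advice above can be sketched in Python as follows. None of this is part of pG and all three function names are made up. Note that `shlex.quote` protects spaces on Mac/UNIX by wrapping the path in single quotes rather than adding backslashes; both forms are understood by the shell.

```python
import shlex


def clean_dropped_path(raw):
    """Remove the trailing whitespace that drag-and-drop often appends."""
    return raw.rstrip()


def quote_for_posix_shell(path):
    """Quote a path for a Mac/Linux shell so that spaces are kept together."""
    return shlex.quote(path)


def quote_for_windows_prompt(path):
    """Wrap a path in double quotes, as Command Prompt expects."""
    return '"' + path + '"'


if __name__ == "__main__":
    print(quote_for_posix_shell(clean_dropped_path("/My Path/My file.txt  ")))
    print(quote_for_windows_prompt(r"C:\My folder\My file.txt"))
```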
You might have hit a button while the program was running earlier and it remembered that. Try entering the command again, but if you still get problems contact me.
I've set phyloGenerator to pause for five seconds after every ten downloads, so as not to overload the NCBI database. However, I'm being massively conservative, and if you want to alter that (at your own risk!) then play around with the 'delay' argument when running phyloGenerator. Many of the programs phyloGenerator calls take quite a while too - if you can hear your computer whirring, it's probably just building your phylogeny.
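To give a feel for what that pausing looks like, here is a minimal sketch of the general pattern. It is not pG's actual download code: the function and parameter names are made up, the `delay` parameter here only borrows the name of pG's argument, and skipping the pause after the final download is a choice made for this example.

```python
import time


def download_all(items, download, batch_size=10, delay=5.0, sleep=time.sleep):
    """Call `download` on each item, pausing `delay` seconds after every batch.

    items    -- a list of things to download (e.g. species names)
    download -- hypothetical callable that fetches one item
    sleep    -- the pause function; replaceable so the logic can be tested
    """
    results = []
    for count, item in enumerate(items, start=1):
        results.append(download(item))
        # Pause after every `batch_size` downloads, but not after the last item.
        if count % batch_size == 0 and count < len(items):
            sleep(delay)
    return results
```

With the defaults, 25 downloads would include two five-second pauses (after the 10th and 20th), which is why large downloads take a while even on a fast connection.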
If you're aligning sequences and it seems to be taking forever, make sure there are no abnormally long sequences in your dataset. If most of your sequences are 1000bp long, and one or two are >10000bp long, you'll need to trim the sequences or you'll crash most alignment programs.
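A quick way to spot abnormally long sequences is to compare each sequence's length with the median length in the file. The sketch below is not part of pG; the function names are made up, and the cut-off of five times the median is an arbitrary choice for illustration.

```python
import statistics


def read_fasta_lengths(lines):
    """Return {sequence name: length} from an iterable of FASTA-formatted lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            lengths[name] = 0
        elif line and name is not None:
            # Sequences may be wrapped over several lines; add them all up.
            lengths[name] += len(line)
    return lengths


def unusually_long(lengths, factor=5):
    """Names of sequences more than `factor` times the median length."""
    if not lengths:
        return []
    median = statistics.median(lengths.values())
    return [name for name, n in lengths.items() if n > factor * median]
```

You could use it with something like `unusually_long(read_fasta_lengths(open(path)))`, where `path` is the absolute path to your FASTA file; anything it names is worth trimming before you align.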
Make sure you have
This is nothing to worry about, unless it crashes phyloGenerator or causes it to exit. If either of those things happens, consider not downloading as many species (more than six hundred seems to cause it problems, and I've not designed this program to handle that many), but please do send me an email about it.
Sorry, you can't. Only one run at a time, as the temporary files phyloGenerator uses can get confused and you can get strange results.
Click on command prompt's icon (top-left of the screen), click properties, then go to layout and change the screen width.
Have a read through the walkthrough. If that doesn't help, drop me an email. I'm afraid I can't explain the whole of phylogenetics to you (sorry!) but I'm quite likely to help you if you send me a polite email. Even more so if you promise me a beer!
I probably agree with you. Please, send me an email and let's talk about it. Maybe we can improve the program, but at the very least I'd appreciate your feedback. For the record, I certainly don't think this program is a replacement for phylogeneticists.
Oh no! Send me a copy-pasted version of everything you did and got back from phyloGenerator, any files you gave to phyloGenerator, and details about your computer (Mac? Windows? What version?). I'm always grateful for feedback (even errors!), and I'll try and help you as soon as I can (I normally reply to emails on the same day they're sent, but remember I'm in the US so time zones may be a factor). If you're on a Windows computer, please try running the program from 'command prompt' first (see above, third FAQ entry) as copy-pasting the error message will really help me help you. Make sure you read all the FAQs above first, though!