Should I use version 1 or 2?
Go with version 1 if you're new to phylogenetics. Version 2 is designed for building extremely large phylogenies; it's not the place to start for a beginner. Version 1 still works great, and is perfect for medium-sized phylogenies. Version 2 is fast, and that speed makes it inherently more dangerous!
Do I have to install all the programs?
No. Only the ones you need, which are Ruby, BioRuby, MAFFT, and one of the phylogenetics programs.
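If you're not sure what is already on your machine, a quick check along the lines of the sketch below can tell you. It is only an illustration and not part of pG2: the executable names (`ruby`, `mafft`, `raxmlHPC`) are assumptions and may well differ on your system, and BioRuby is a Ruby library rather than a program on your PATH, so it isn't covered here.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them to match your own installation.
    wanted = ["ruby", "mafft", "raxmlHPC"]
    missing = missing_tools(wanted)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All found.")
```

Anything reported as missing is either not installed or installed under a different name.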
The install script doesn't work for me
Make sure you have given the script permission to run (chmod +x install.sh). Beyond that... I'm not able to give much more help (see below), but I'd suggest looking at each program's own website for more help.
Why isn't there a single download like there was for phyloGenerator 1?
Fundamentally, pG2 is written for people who are not complete beginners. Installing phylogenetics software is something most phylogeneticists have already done, so there's no need for me to bundle up things that you probably already have.
Please do not underestimate the complexity of getting a program like phyloGenerator into a single download. It took almost as long to get that thing working as it did to actually write the program, and maintaining it is an extremely intensive process.
Each program within pG1 was extremely difficult to get working. If you don't believe me... you've never tried something like that. I have committed to maintaining pG1 for as long as I can in that state; I can't even begin to start with a new version.
Why haven't you written an install script for Windows?
Such a thing would be very, very difficult for me (see answer to the above). On a side-note, I've not been on Windows for ~5 years (try Ubuntu, honestly, it's great).
It can't find any sequences, even though I know they're there on GenBank
The purpose of the checking and cleaning algorithms is to find good sequences. The sequences you've found didn't pass the quality tests!
No, seriously, I can't find any sequences and I know there are good ones
Make sure you've set your alignment length options correctly (see guide). If you set them such that your own reference sequences wouldn't pass, you'll never find anything.
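A quick sanity check before searching is to confirm that your own reference sequences fall inside the length window you've chosen. The sketch below is hypothetical: it assumes the options act as a simple minimum and maximum on sequence length, and `min_len`, `max_len`, and the function name are illustrative, not pG2's actual option names (see the guide for those).

```python
def references_failing_length(references, min_len, max_len):
    """Return the names of reference sequences whose length is outside [min_len, max_len]."""
    return [
        name
        for name, seq in references.items()
        if not (min_len <= len(seq) <= max_len)
    ]


# Made-up example data: one reference of 600 bases and one of 160 bases.
refs = {"ref_a": "ACGT" * 150, "ref_b": "ACGT" * 40}
failing = references_failing_length(refs, min_len=400, max_len=800)
print("References outside the length window:", failing)
```

If that list isn't empty, widen the window or pick different references before you search.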
How should I choose my reference sequences? What are good search parameters?
See the guide on this website. If that doesn't answer your question, try the mailing list.
I used the Hawkeye option and now I've got very few good sequences
Consider using the sequences you have now as the reference sequences for a new search. Remember, there is also a cache option.
This is taking too long
In general, building large phylogenies is a very time-consuming process. It could easily take longer than the sequence downloads, particularly if you're running the whole thing on a laptop. If it's taking much longer than you think it should given the size of the problem, check the quality of your alignment. Bad alignments --> slower phylogeny searches and (most importantly) bad answers.
Why can I only use a constraint tree with RAxML?
...because that's the only program that supports them :D
Where did support for BEAST go?
BEAST is still supported in pG1. Remember that output from pG2 is still usable in pG1, or (and this is probably the best option) you can take your alignments and build your own XML file in BEAST. Maintaining the BEAST XML templating engine is a huge drain on my time for pG1, and I'm keen not to do it twice.
Do I have to edit my constraint tree if not all my species are in there, like in pG1?
No. Just make sure that all the species you're building a tree of are in the constraint tree at the start!
In passing, in pG1 it's quite easy to just write out your sequences, then drop your constraint tree down to the species list you now know. I actually intended it to be used like that, and given that pG1 supports dated constraint trees it's actually the only way that makes sense to build a tree (I think).
Is it safe to have beginners building phylogenies "automatically"?
Possibly not; please remember pG2 is not designed for the beginner.
I don't like the idea of automated phylogenies; without user checking something could go wrong
I don't recommend you use pG2 without some checking; it's a tool to speed things up, not to do everything. However, do consider something for a moment: who watches the watchmen? pG2 is reproducible and describes itself perfectly in its code. Manually altering an alignment or constraint tree until it satisfies a condition that you yourself cannot describe is not. I would suggest that it is better to be wrong and be falsifiable (in this context reproducible and describable) than to claim 'correctness' with no means of verification.
What about fossil constraints?
I agree; I want to (at the very least) put treePL in here. More soon, I hope.
I'm really sorry if you're having problems that none of the above can help you with. Please, please, please do get in contact if you're having trouble getting the program to work. I will do everything I can to help you, but I can help you best if you follow the checklist below (in order, I guess):
Simplify your problem to the bare minimum
By working with the simplest example that generates the problem, you make it easier for me to isolate the problem. Please find the closest thing you can to a minimal, complete, verifiable example; nine times out of ten, this process will help you find the solution anyway.
Use the mailing list
I get a lot of emails about pG, and many of them are about the same thing. I set up the mailing list to help with this.
I like talking to people, and I like helping people. If you get in touch with me, I promise I will do everything I can to help you. Don't be put off by a four-point list of things to do; I'm not an ogre, and I will buy you a beer if you get in touch...