I have written several software packages that might be of use to others. All the code is licensed under the GPL.
CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a microbial gene finder that combines traditional approaches to gene finding with a novel comparative analysis. If a nucleotide alignment contains a pair of ORFs whose conceptually translated products are more conserved than would be expected from the amount of conservation at the nucleotide level, this is evolutionary evidence that the DNA sequences are protein coding. Regions found by this method are used to generate traditional dicodon frequencies for further analysis. CRITICA is thus not dependent on (often erroneous) sequence annotations, on which many other algorithms base their training sets, and it uses comparative information in a more biologically meaningful way than a simple similarity search. CRITICA was used in the Archaeoglobus fulgidus and Aquifex aeolicus genome projects and is still in use by several groups. The algorithm is described in:
and all publicly released versions of the program can be downloaded below (v1.05 is the latest):