New features
Major
ClusterCompare
ClusterCompare is a new algorithm for cluster comparison. It takes into account the presence of biosynthetic profiles, NRPS/PKS module counts and layouts, and gene functions, along with the sequence identity and synteny used by the clusterblast module.
The score for any pairing ranges from 0 to 1, with 1 being a theoretical perfect score. When comparing multiple protoclusters against a single region, the score may be higher than 1; mousing over each protocluster's result will show the 0-1 score for that particular protocluster.
ClusterCompare has been initially released with a comparison database for MIBiG 2.0 (--cc-mibig), but includes a script for easily generating custom databases (which can be used with --cc-custom).
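The relationship between the per-protocluster scores and a region-level score above 1 can be illustrated with a minimal sketch. It assumes the region score is the plain sum of the per-protocluster scores; these notes do not state the actual aggregation, and the function name and example values are hypothetical.

```python
from typing import Dict


def region_score(protocluster_scores: Dict[str, float]) -> float:
    """Combine per-protocluster scores into a single region-level score.

    ASSUMPTION: plain summation is used here purely for illustration;
    the release notes only say the combined score "may be higher" than 1.
    """
    for name, score in protocluster_scores.items():
        # Each individual pairing is documented as scoring between 0 and 1.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1], got {score}")
    return sum(protocluster_scores.values())


# Hypothetical scores for two protoclusters in one region.
scores = {"protocluster_1": 0.75, "protocluster_2": 0.5}
total = region_score(scores)
print(total)
```

Under this assumption, two protoclusters that each match well can push the region total past 1, which is why the per-protocluster value is the one to read as a 0-1 score.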
RREfinder
The precision mode of RREfinder was added and can be run with --rre. This will
detect and annotate RiPP Recognition Elements within RiPP protoclusters.
TIGRfam detection
Annotations based on the TIGRfam database can now be added to a run (with --tigrfam).
These annotations will be present in the genbank output and
will also be shown in the gene detail panel on the top right of the HTML output.
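As a rough sketch, the analysis options introduced so far could be combined in a single run. The helper below only assembles the argument list; its name and the input file `genome.gbk` are hypothetical, and it assumes the `antismash` executable is on the PATH and takes the input file as a positional argument.

```python
import shlex
from typing import List


def build_antismash_command(
    input_path: str,
    cc_mibig: bool = False,
    rre: bool = False,
    tigrfam: bool = False,
) -> List[str]:
    """Assemble an antiSMASH command line from the flags named in these notes."""
    command = ["antismash"]
    if cc_mibig:
        command.append("--cc-mibig")  # ClusterCompare against MIBiG
    if rre:
        command.append("--rre")  # RREfinder precision mode
    if tigrfam:
        command.append("--tigrfam")  # TIGRfam annotations
    command.append(input_path)  # hypothetical input file
    return command


command = build_antismash_command("genome.gbk", cc_mibig=True, rre=True, tigrfam=True)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be handed to `subprocess.run(command, check=True)` without shell quoting issues.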
Annotation sideloading
Running antiSMASH analyses on genomic areas in which antiSMASH does not
detect a cluster is now possible via sideloaded annotations. These
annotations, provided in a JSON format, can contain protoclusters and/or subregions
along with extra details to attach to those regions. One or more external
annotation files can be sideloaded with --sideloader.
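To give a feel for what such an annotation file might contain, here is a minimal sketch that builds one in Python. Every field name, value and the record identifier below is an illustrative assumption rather than the documented schema; the sideloading schema shipped with antiSMASH is the authority on the real layout.

```python
import json

# ASSUMPTION: all keys below are illustrative only, not the confirmed schema.
annotations = {
    "tool": {
        "name": "example-tool",  # hypothetical external tool
        "version": "1.0",
        "description": "Example annotations produced outside antiSMASH",
    },
    "records": [
        {
            "name": "RECORD_ID",  # placeholder for the record the areas belong to
            "subregions": [
                {
                    "start": 1000,
                    "end": 25000,
                    "label": "candidate",
                    "details": {"note": "extra details attached to this area"},
                }
            ],
            "protoclusters": [
                {
                    "core_start": 30000,
                    "core_end": 42000,
                    "product": "example-product",
                }
            ],
        }
    ],
}

# Serialise to the JSON text that would be written to the sideloaded file.
serialised = json.dumps(annotations, indent=2)
print(serialised)
```

The sketch mirrors the description above: one file carries both subregions and protoclusters, each with coordinates and optional extra details.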
Clusterfinder
The clusterfinder module has been removed because its results were often misleading or misinterpreted.
Minor
Analysis:
- a number of additions and changes were made to detected RiPP protoclusters, see the glossary page for details
Output:
- JSON output now includes a simple representation of all detected areas
- output filenames can be customised with --output-basename instead of defaulting to the input name
Visualisation:
- Pfam domains for the region are shown in a detail tab similar to the existing NRPS/PKS domains
- Pfam domain hits for a single gene are now included in the gene detail panel of the HTML output
- added a button to download SVG images in the HTML output
Other:
- detection rules can now create and use aliases for groups of profiles
Fixes and small changes
Input handling:
- improved error messages for invalid inputs
- improved handling of non-standard genbank inputs
Other:
- a workaround has been added for a macOS-specific issue when using Python 3.8+ with --cpus set higher than 1
Numerous other small changes and fixes were made internally; for a full list, see the git shortlog.