snpio.plotting.plotting.Plotting
- class snpio.plotting.plotting.Plotting(genotype_data, show=False, plot_format='png', dpi=300, plot_fontsize=18, plot_title_fontsize=22, despine=True, verbose=False, debug=False)
Class containing various methods for generating plots based on genotype data.
This class is initialized with a GenotypeData object containing necessary data. The class attributes are set based on the provided values, the GenotypeData object, or default values.
- genotype_data
Initialized GenotypeData object containing necessary data.
- Type:
GenotypeData
- prefix
Prefix string for output directories and files.
- Type:
str
- output_dir
Output directory for saving plots.
- Type:
Path
- show
Whether to display the plots.
- Type:
bool
- plot_format
Format in which to save the plots.
- Type:
str
- dpi
Resolution of the saved plots.
- Type:
int
- plot_fontsize
Font size for the plot labels.
- Type:
int
- plot_title_fontsize
Font size for the plot titles.
- Type:
int
- despine
Whether to remove the top and right plot axis spines.
- Type:
bool
- verbose
Whether to enable verbose logging.
- Type:
bool
- debug
Whether to enable debug logging.
- Type:
bool
- logger
Logger object for logging messages.
- Type:
logging.Logger
- boolean_filter_methods
List of boolean filter methods.
- Type:
list
- missing_filter_methods
List of missing data filter methods.
- Type:
list
- maf_filter_methods
List of MAF filter methods.
- Type:
list
- mpl_params
Default Matplotlib parameters for the plots.
- Type:
dict
- plot_sankey_filtering_report()
Plot a Sankey diagram for the filtering report.
- plot_pca()
Plot a PCA scatter plot with 2 or 3 dimensions, colored by missing data proportions, and labeled by population with symbols for each sample.
- plot_summary_statistics()
Plot summary statistics per sample and per population on the same figure. The summary statistics are plotted as lines for each statistic (Ho, He, Pi, Fst).
- plot_dapc()
Plot a DAPC scatter plot with 2 or 3 dimensions, colored by population, and labeled by population with symbols for each sample.
- plot_fst_heatmap()
Plot a heatmap of Fst values between populations, sorted by highest Fst and displaying only the lower triangle.
- plot_fst_outliers()
Plot a heatmap of Fst values for outlier SNPs, highlighting contributing population pairs.
- _set_logger()
Set the logger object based on the debug attribute. If debug is True, the logger will log debug messages.
- _get_attribute_value()
Determine the value for an attribute based on the provided argument, genotype_data attribute, or default value. If a value is provided during initialization, it is used. Otherwise, the genotype_data attribute is used if available. If neither is available, the default value is used.
- _plot_summary_statistics_per_sample()
Plot summary statistics per sample. If an axis is provided, the plot is drawn on that axis.
- _plot_summary_statistics_per_population()
Plot summary statistics per population. If an axis is provided, the plot is drawn on that axis.
- __init__(genotype_data, show=False, plot_format='png', dpi=300, plot_fontsize=18, plot_title_fontsize=22, despine=True, verbose=False, debug=False)
Initialize the Plotting class.
This class contains various methods for generating plots based on genotype data. The class is initialized with a GenotypeData object containing necessary data. The class attributes are set based on the provided values, the GenotypeData object, or default values.
- Parameters:
genotype_data (GenotypeData) – Initialized GenotypeData object containing necessary data.
show (bool) – Whether to display the plots. Defaults to genotype_data.show if available, otherwise False.
plot_format (str) – The format in which to save the plots (e.g., ‘png’, ‘svg’). Defaults to genotype_data.plot_format if available, otherwise ‘png’.
dpi (int) – The resolution of the saved plots. Unused for vector plot_format types. Defaults to genotype_data.dpi if available, otherwise 300.
plot_fontsize (int) – The font size for the plot labels. Defaults to genotype_data.plot_fontsize if available, otherwise 18.
plot_title_fontsize (int) – The font size for the plot titles. Defaults to genotype_data.plot_title_fontsize if available, otherwise 22.
despine (bool) – Whether to remove the top and right plot axis spines. Defaults to genotype_data.despine if available, otherwise True.
verbose (bool) – Whether to enable verbose logging. Defaults to genotype_data.verbose if available, otherwise False.
debug (bool) – Whether to enable debug logging. Defaults to genotype_data.debug if available, otherwise False.
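The fallback order described for these parameters (explicit argument, then the matching genotype_data attribute, then a built-in default) can be illustrated with a minimal sketch. This is not the library's implementation: the function name resolve_attribute, the use of None to mean "not provided", and the SimpleNamespace stand-in for a GenotypeData object are all assumptions made for illustration.

```python
from types import SimpleNamespace


def resolve_attribute(provided, source, name, default):
    """Hypothetical helper: pick a value using the documented fallback order.

    Assumption: None means "not provided". The real class may detect
    missing arguments differently.
    """
    # 1. A value passed explicitly at initialization wins.
    if provided is not None:
        return provided
    # 2. Otherwise fall back to the attribute on the source object, if set.
    inherited = getattr(source, name, None)
    if inherited is not None:
        return inherited
    # 3. Otherwise use the built-in default.
    return default


# Stand-in for a GenotypeData object (hypothetical attribute values).
genotype_data = SimpleNamespace(plot_format="svg", dpi=None)

resolved_format = resolve_attribute(None, genotype_data, "plot_format", "png")
resolved_dpi = resolve_attribute(None, genotype_data, "dpi", 300)
resolved_show = resolve_attribute(True, genotype_data, "show", False)
```

In this sketch, plot_format is taken from the stand-in object, dpi falls through to the default because the object's value is unset, and show uses the explicit argument even though the object has no such attribute.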
Note
The show, plot_format, dpi, plot_fontsize, plot_title_fontsize, despine, verbose, and debug attributes are set based on the provided values, the genotype_data object, or default values.
The output_dir attribute is set to the prefix_output/nremover/plots directory, or to the prefix_output/plots directory if the genotype data had not been filtered when the Plotting class was initialized.
The mpl_params dictionary contains default Matplotlib parameters for the plots and is updated with the mpl_params dictionary.
The plotting object is used to set the attributes based on the provided values, the genotype_data object, or default values.
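The output directory rule in the note above can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: it assumes that "prefix_output" means the prefix string followed by "_output", and the function name resolve_plot_dir and the was_filtered flag are hypothetical.

```python
from pathlib import Path


def resolve_plot_dir(prefix, was_filtered):
    """Hypothetical helper: build the plot directory described in the note.

    Assumption: the base directory is "<prefix>_output".
    """
    base = Path(f"{prefix}_output")
    if was_filtered:
        # Filtered genotype data: plots go under the nremover subdirectory.
        return base / "nremover" / "plots"
    # Unfiltered genotype data: plots go directly under the base directory.
    return base / "plots"


filtered_dir = resolve_plot_dir("example", was_filtered=True)
unfiltered_dir = resolve_plot_dir("example", was_filtered=False)
```

The sketch only builds the paths; it does not create any directories on disk.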
Methods
- __init__(genotype_data[, show, plot_format, ...])
Initialize the Plotting class.
- plot_allele_summary(summary[, figsize])
Plot allele summary statistics from summarize_alleles output.
- plot_d_statistics(df, method)
Create and save D-statistics plots and MultiQC reports.
- plot_d_statistics_heatmap(df[, method_name])
Plots a heatmap of D-statistics colored by -log10(P-value).
- plot_dist_matrix(df, *[, pvals, palette, ...])
Plot distance matrix.
- plot_dstat_chi_square_distribution(df, ...)
Plots the distribution of Chi-square values for D-statistics.
- plot_dstat_pvalue_distribution(df, method_name)
Plots the distribution of -log10(P-values) for D-statistics.
- plot_dstat_significance_counts(df, method_name)
Plots the number of significant results per D-statistic.
- plot_fst_outliers(outlier_snps, method[, ...])
Create a heatmap of Fst values for outlier SNPs, highlighting contributing population pairs.
- plot_gt_distribution(df[, annotation_size])
Plot the distribution of genotype counts.
- plot_permutation_dist(obs_fst, dist, ...[, ...])
Plot the permutation distribution of Fst values.
- plot_pop_counts(populations)
Plot the population counts.
- plot_search_results(df_combined)
Plot and save the filtering results based on the available data.
- plot_stacked_significance_barplot(df, ...)
Creates a stacked bar plot of significance categories.
- plot_summary_statistics(summary_statistics)
Plot summary statistics per sample and per population.
- visualize_missingness(df[, prefix, zoom, ...])
Visualize missing data across loci, individuals, and populations.