binlearn.base.FlexibleBinningBase
- class binlearn.base.FlexibleBinningBase(preserve_dataframe: bool | None = None, fit_jointly: bool | None = None, guidance_columns: Any | None = None, *, bin_spec: dict[Any, list[Any]] | None = None, bin_representatives: dict[Any, list[float]] | None = None)
Base class for flexible binning methods that support mixed bin types.
This class extends GeneralBinningBase to provide specialized functionality for flexible binning methods. Unlike traditional interval-based binning, flexible binning supports mixed bin types within the same feature, including singleton bins (exact value matching) and interval bins (range matching) in any combination.
Flexible binning is particularly useful for:
- Categorical features with numeric representations
- Mixed data types requiring different binning strategies per value
- Custom binning schemes that don’t fit traditional interval patterns
- Data with important singleton values that should be preserved exactly
Key Features:
- Mixed bin types: Combine singleton and interval bins in the same feature
- Custom bin specifications: Define bins as either exact values or ranges
- Automatic representative generation: Creates numeric representatives for mixed bins
- Flexible transformation: Handles both numeric and non-numeric data appropriately
- bin_spec
Dictionary mapping column identifiers to lists of flexible bin definitions. Each bin can be either a scalar (singleton) or tuple (interval).
- bin_representatives
Dictionary mapping column identifiers to lists of numeric representative values for each bin. Auto-generated if not provided.
Example
>>> # Example of flexible bin specification
>>> bin_spec = {
...     'mixed_feature': [
...         42,          # Singleton bin: exactly value 42
...         (10, 20),    # Interval bin: range [10, 20]
...         'special',   # Categorical singleton
...         (100, 200),  # Another interval
...     ]
... }
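To make the mixed-bin idea concrete, here is a minimal pure-Python sketch of how a value could be matched against such a specification. It is not binlearn's implementation: the helper name find_bin_index, the first-match rule, the closed interval bounds (taken from the "[10, 20]" comment in the example), and the -1 marker for unmatched values are all assumptions made for illustration.

```python
from typing import Any


def find_bin_index(value: Any, bins: list[Any], missing: int = -1) -> int:
    """Return the index of the first bin that matches `value`.

    Hypothetical helper: tuples are treated as closed intervals [low, high],
    anything else as a singleton matched by equality. Unmatched values get
    `missing` (an assumed marker, not necessarily the library's).
    """
    for index, bin_def in enumerate(bins):
        if isinstance(bin_def, tuple):
            low, high = bin_def
            # Only real numbers can fall inside an interval bin.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if is_number and low <= value <= high:
                return index
        elif value == bin_def:
            # Singleton bin: exact value match.
            return index
    return missing


bins = [42, (10, 20), 'special', (100, 200)]
indices = [find_bin_index(v, bins) for v in [42, 15, 'special', 150, 7]]
print(indices)  # [0, 1, 2, 3, -1]
```

The value 7 matches neither a singleton nor an interval, so it receives the assumed "missing" marker.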
Note
- This is an abstract base class; use a concrete implementation such as ManualFlexibleBinning.
- Bin representatives are generated automatically: midpoints for interval bins, and the bin value itself for singleton bins (converted to a number where possible).
- Inherits all functionality from GeneralBinningBase, including the fit/transform interface.
- Subclasses must implement the abstract _do_fit_single_column method.
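The representative rule described in the note can be sketched as follows. This is an illustrative approximation, not the library's code: the function name make_representatives is hypothetical, and falling back to the bin's position for singletons that cannot be converted to a number is an assumption, since the documentation only says conversion happens "where possible".

```python
from typing import Any


def make_representatives(bins: list[Any]) -> list[float]:
    """Build one numeric representative per flexible bin (hypothetical helper)."""
    representatives: list[float] = []
    for position, bin_def in enumerate(bins):
        if isinstance(bin_def, tuple):
            # Interval bin: use the midpoint of the range.
            low, high = bin_def
            representatives.append((low + high) / 2)
        else:
            try:
                # Singleton bin: keep the value, converted to float.
                representatives.append(float(bin_def))
            except (TypeError, ValueError):
                # ASSUMPTION: non-numeric singletons fall back to the bin position.
                representatives.append(float(position))
    return representatives


reps = make_representatives([42, (10, 20), 'special', (100, 200)])
print(reps)  # [42.0, 15.0, 2.0, 150.0]
```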
- __init__(preserve_dataframe: bool | None = None, fit_jointly: bool | None = None, guidance_columns: Any | None = None, *, bin_spec: dict[Any, list[Any]] | None = None, bin_representatives: dict[Any, list[float]] | None = None)
Initialize flexible binning base class.
- Parameters:
preserve_dataframe – Whether to preserve the original DataFrame format during transformation. If True, returns DataFrame when input is DataFrame. If False, returns numpy array. If None, uses the global configuration default from binlearn.config.preserve_dataframe.
fit_jointly – Whether to fit all columns together using shared information. If True, performs joint fitting across columns. If False, fits each column independently. If None, uses method-specific default behavior.
guidance_columns – Additional columns to use as guidance for binning decisions. These columns are not binned themselves but can influence the binning of other columns. Can be column names/indices or None for no guidance.
bin_spec – Pre-defined flexible bin specification as a dictionary mapping column identifiers to lists of bin definitions. Each bin definition can be either a scalar value (singleton bin) or a tuple (interval bin). If provided, no fitting is performed and this specification is used directly.
bin_representatives – Pre-defined representative values for each bin as a dictionary mapping column identifiers to lists of numeric values. Must match the structure of bin_spec if provided. If None, representatives are automatically generated from bin_spec.
- Raises:
ConfigurationError – If bin specifications are invalid or incompatible.
Example
>>> # Initialize with custom bin specification
>>> bin_spec = {
...     'feature1': [10, (20, 30), 40],
...     'feature2': ['A', 'B', (1, 5)]
... }
>>> binner = ConcreteFlexibleBinner(bin_spec=bin_spec)
>>>
>>> # Initialize for automatic fitting
>>> binner = ConcreteFlexibleBinner(fit_jointly=True)
Note
- When bin_spec is provided, the binning is pre-configured and fit() becomes a no-op.
- If bin_spec is given without bin_representatives, the representatives are generated automatically.
- The guidance_columns feature may not be supported by all flexible binning methods.
- All parameters are passed to the parent GeneralBinningBase constructor.
Methods
__init__([preserve_dataframe, fit_jointly, ...]) – Initialize flexible binning base class.
check_data_quality(data[, name]) – Check data quality and issue warnings if needed.
fit(X[, y]) – Fit the binning transformer with comprehensive orchestration.
fit_transform(X[, y]) – Fit to data, then transform it.
get_input_columns() – Get input columns for data preparation.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator, including fitted parameters.
inverse_transform(X) – Inverse transform from bin indices back to representative values.
set_output(*[, transform]) – Set output container.
set_params(**params) – Set the parameters of this estimator.
transform(X) – Transform input data using fitted binning parameters.
validate_array_like(data[, name, allow_none]) – Validate and convert array-like input to numpy array.
validate_column_specification(columns, ...) – Validate column specifications.
validate_guidance_columns(guidance_cols, ...) – Validate guidance column specifications.
Attributes
feature_names_in_ – Get feature names.
n_features_in_ – Get number of features.
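As a rough illustration of what inverse_transform is described as doing (mapping bin indices back to representative values), the following sketch performs the lookup for a single column. The function name, the -1 marker for unmatched values, and the choice of NaN for them are assumptions, not documented binlearn behavior.

```python
import math


def indices_to_representatives(
    indices: list[int],
    representatives: list[float],
    missing: int = -1,
) -> list[float]:
    """Map bin indices back to representative values (hypothetical helper)."""
    # ASSUMPTION: indices equal to `missing` have no bin and become NaN.
    return [math.nan if i == missing else representatives[i] for i in indices]


values = indices_to_representatives([0, 1, 3, -1], [42.0, 15.0, 2.0, 150.0])
print(values)  # [42.0, 15.0, 150.0, nan]
```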