Preprocessing#

Preprocessing, augmentation, and split helpers.

Classes:

Name	Description
`PreparedSplits`	Preprocessed arrays ready for classifier training.

Functions:

Name	Description
`set_numpy_seed`	Set NumPy's global random seed.
`labels_to_zero_based`	Convert BCI IV-2a event labels `769-772` to class IDs `0-3`.
`one_hot`	Convert integer class labels to one-hot rows.
`augment_trials`	Apply the package temporal augmentation routine.
`to_channels_last_image`	Convert `(trials, channels, time)` to `(trials, time, 1, channels)`.
`select_subject`	Filter arrays to one subject, or return copies for all subjects.
`prepare_splits`	Create augmented train/validation/test arrays ready for classifiers.
`prepare_gan_training_arrays`	Normalize classifier arrays into the GAN sequence format.

PreparedSplits `dataclass` #

PreparedSplits(
    x_train: ndarray,
    y_train: ndarray,
    x_valid: ndarray,
    y_valid: ndarray,
    x_test: ndarray,
    y_test: ndarray,
)

Preprocessed arrays ready for classifier training.

Attributes:

Name	Type	Description
`x_train`	`ndarray`	Augmented training inputs shaped `(n, time, 1, channels)`.
`y_train`	`ndarray`	One-hot training labels shaped `(n, 4)`.
`x_valid`	`ndarray`	Augmented validation inputs.
`y_valid`	`ndarray`	One-hot validation labels.
`x_test`	`ndarray`	Augmented held-out test inputs.
`y_test`	`ndarray`	One-hot held-out test labels.

set_numpy_seed #

set_numpy_seed(seed: int) -> None

Set NumPy's global random seed.

Parameters:

Name	Type	Description	Default
`seed`	`int`	Integer seed value.	required
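
As a brief illustration of what seeding the global state buys you, the sketch below calls `np.random.seed` directly. It assumes `set_numpy_seed` is a thin wrapper around that call, which this page does not confirm.

```python
import numpy as np

# Assumption: set_numpy_seed(seed) behaves like np.random.seed(seed).
np.random.seed(42)
first = np.random.rand(3)

# Re-seeding with the same value replays the same global random stream.
np.random.seed(42)
second = np.random.rand(3)

# An explicit Generator carries its own state and is reproducible
# independently of the global seed.
draw_a = np.random.default_rng(7).normal(size=3)
draw_b = np.random.default_rng(7).normal(size=3)
```

Functions that accept an explicit `Generator`, such as `augment_trials`, can be made reproducible by passing a seeded generator instead of relying on the global state.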

labels_to_zero_based #

labels_to_zero_based(
    y: ndarray, label_offset: int = 769
) -> ndarray

Convert BCI IV-2a event labels `769-772` to class IDs `0-3`.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Raw BCI cue labels.	required
`label_offset`	`int`	Value subtracted from each label.	`769`

Returns:

Type	Description
`ndarray`	Integer class labels in `[0, 3]`.

Raises:

Type	Description
`ValueError`	If converted labels fall outside `[0, 3]`.
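
The conversion described above can be sketched as follows. This is a minimal stand-in written from the documented behaviour, not the package's implementation; the name `labels_to_zero_based_sketch` is hypothetical.

```python
import numpy as np

def labels_to_zero_based_sketch(y, label_offset=769):
    """Hypothetical stand-in mirroring the documented behaviour."""
    # Subtract the offset so cue codes 769..772 become class IDs 0..3.
    converted = np.asarray(y).astype(np.int64) - label_offset
    # Reject anything outside the four expected classes.
    if converted.size and (converted.min() < 0 or converted.max() > 3):
        raise ValueError("converted labels fall outside [0, 3]")
    return converted

raw = np.array([769, 772, 770, 771])
class_ids = labels_to_zero_based_sketch(raw)
```

Validating the range after subtraction catches a wrong `label_offset` early, before the labels reach `one_hot`.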

one_hot #

one_hot(y: ndarray, n_classes: int = 4) -> ndarray

Convert integer class labels to one-hot rows.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Integer labels.	required
`n_classes`	`int`	Number of classes.	`4`

Returns:

Type	Description
`ndarray`	Float32 one-hot array shaped `(len(y), n_classes)`.
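
One common way to build such an array is integer indexing into a zero matrix. The sketch below assumes that approach; `one_hot_sketch` is a hypothetical name, not the package function.

```python
import numpy as np

def one_hot_sketch(y, n_classes=4):
    """Hypothetical stand-in: float32 one-hot rows, one per label."""
    y = np.asarray(y, dtype=np.int64)
    out = np.zeros((len(y), n_classes), dtype=np.float32)
    # Set a single 1.0 per row at the column given by the label.
    out[np.arange(len(y)), y] = 1.0
    return out

encoded = one_hot_sketch(np.array([0, 3, 1]))
```

Taking `encoded.argmax(axis=1)` recovers the original integer labels.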

augment_trials #

augment_trials(
    X: ndarray,
    y: ndarray,
    config: PreprocessingConfig,
    rng: Generator | None = None,
) -> tuple[ndarray, ndarray]

Apply the package temporal augmentation routine.

The routine trims each trial, creates a max-pooled copy, creates an averaged copy with Gaussian noise, and appends one subsampled copy for each temporal offset.

Parameters:

Name	Type	Description	Default
`X`	`ndarray`	EEG trials shaped `(trials, channels, time)`.	required
`y`	`ndarray`	Trial labels aligned with `X`.	required
`config`	`PreprocessingConfig`	Preprocessing and augmentation settings.	required
`rng`	`Generator \| None`	Optional NumPy random generator.	`None`

Returns:

Type	Description
`tuple[ndarray, ndarray]`	Tuple of augmented trials and aligned labels.

Raises:

Type	Description
`ValueError`	If array dimensions or temporal settings are incompatible.
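
The four steps can be sketched in plain NumPy. The parameter names `trim_end`, `sub_sample`, and `noise_std` are assumptions made for illustration; the actual fields of `PreprocessingConfig` are not listed on this page, and the package may order or parameterize the steps differently.

```python
import numpy as np

def augment_trials_sketch(X, y, trim_end, sub_sample, noise_std, rng=None):
    """Hypothetical sketch of trim, max-pool, noisy average, and subsample."""
    rng = np.random.default_rng() if rng is None else rng
    if X.ndim != 3 or X.shape[0] != y.shape[0]:
        raise ValueError("X must be (trials, channels, time) and aligned with y")

    # 1. Trim each trial to the first `trim_end` time samples.
    X = X[:, :, :trim_end]
    n, c, t = X.shape
    if t % sub_sample != 0:
        raise ValueError("trimmed length must be divisible by sub_sample")

    # Group time into non-overlapping windows of length `sub_sample`.
    windows = X.reshape(n, c, t // sub_sample, sub_sample)

    # 2. Max-pooled copy.
    pooled = windows.max(axis=3)
    # 3. Averaged copy with additive Gaussian noise.
    averaged = windows.mean(axis=3) + rng.normal(0.0, noise_std, size=pooled.shape)

    parts_X, parts_y = [pooled, averaged], [y, y]
    # 4. One subsampled copy per temporal offset.
    for offset in range(sub_sample):
        parts_X.append(X[:, :, offset::sub_sample])
        parts_y.append(y)

    return np.concatenate(parts_X, axis=0), np.concatenate(parts_y, axis=0)

X_demo = np.random.default_rng(1).normal(size=(4, 3, 12))
y_demo = np.array([0, 1, 2, 3])
X_aug, y_aug = augment_trials_sketch(
    X_demo, y_demo, trim_end=10, sub_sample=2, noise_std=0.5,
    rng=np.random.default_rng(0),
)
```

Under these assumptions, each input trial yields `2 + sub_sample` output trials whose time axis is shortened by a factor of `sub_sample`.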

to_channels_last_image #

to_channels_last_image(X: ndarray) -> ndarray

Convert `(trials, channels, time)` to `(trials, time, 1, channels)`.

Parameters:

Name	Type	Description	Default
`X`	`ndarray`	EEG trials shaped `(trials, channels, time)`.	required

Returns:

Type	Description
`ndarray`	Float32 channels-last image-like representation used by classifiers.
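
The shape change amounts to swapping the channel and time axes and inserting a singleton axis. A minimal sketch in plain NumPy, written from the documented shapes rather than taken from the package:

```python
import numpy as np

# Toy input: 2 trials, 3 channels, 4 time samples.
X_cf = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# (trials, channels, time) -> (trials, time, channels)
swapped = np.transpose(X_cf, (0, 2, 1))
# Insert a singleton "width" axis -> (trials, time, 1, channels)
img = swapped[:, :, np.newaxis, :].astype(np.float32)
```

After the conversion, `img[trial, t, 0, ch]` holds the value that was at `X_cf[trial, ch, t]`.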

select_subject #

select_subject(
    X: ndarray,
    y: ndarray,
    persons: ndarray,
    subject_id: int | None,
) -> tuple[ndarray, ndarray]

Filter arrays to one subject, or return copies for all subjects.

Parameters:

Name	Type	Description	Default
`X`	`ndarray`	EEG trial array.	required
`y`	`ndarray`	Labels aligned with `X`.	required
`persons`	`ndarray`	Subject IDs aligned with `X`.	required
`subject_id`	`int \| None`	Subject ID to keep, or `None` for all subjects.	required

Returns:

Type	Description
`tuple[ndarray, ndarray]`	Filtered `(X, y)` pair.

Raises:

Type	Description
`ValueError`	If `subject_id` is provided but no matching trials exist.
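
The filtering logic can be sketched with a boolean mask. `select_subject_sketch` is a hypothetical name, and flattening `persons` before comparison is an assumption made so that both `(n,)` and `(n, 1)` ID arrays work.

```python
import numpy as np

def select_subject_sketch(X, y, persons, subject_id):
    """Hypothetical stand-in mirroring the documented behaviour."""
    if subject_id is None:
        # No filtering requested: return independent copies.
        return X.copy(), y.copy()
    # Assumption: persons may be (n,) or (n, 1), so flatten first.
    mask = np.asarray(persons).reshape(-1) == subject_id
    if not mask.any():
        raise ValueError(f"no trials found for subject {subject_id}")
    return X[mask], y[mask]

X_all = np.arange(6 * 2 * 3).reshape(6, 2, 3)
y_all = np.array([0, 1, 2, 3, 0, 1])
persons_all = np.array([[0], [0], [1], [1], [2], [2]])
X_one, y_one = select_subject_sketch(X_all, y_all, persons_all, subject_id=1)
```

Boolean-mask indexing already returns new arrays, so the filtered branch does not need an explicit copy.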

prepare_splits #

prepare_splits(
    bundle: DatasetBundle,
    preprocess: PreprocessingConfig | None = None,
    split: DataSplitConfig | None = None,
) -> PreparedSplits

Create augmented train/validation/test arrays ready for classifiers.

Parameters:

Name	Type	Description	Default
`bundle`	`DatasetBundle`	Raw processed arrays in the original six-file layout.	required
`preprocess`	`PreprocessingConfig \| None`	Optional preprocessing configuration.	`None`
`split`	`DataSplitConfig \| None`	Optional split configuration.	`None`

Returns:

Type	Description
`PreparedSplits`	`PreparedSplits` containing channels-last inputs and one-hot labels.

prepare_gan_training_arrays #

prepare_gan_training_arrays(
    x_train: ndarray, y_train: ndarray
) -> tuple[ndarray, ndarray, float]

Normalize classifier arrays into the GAN sequence format.

Parameters:

Name	Type	Description	Default
`x_train`	`ndarray`	Classifier inputs shaped `(n, time, 1, channels)`.	required
`y_train`	`ndarray`	One-hot labels.	required

Returns:

Type	Description
`tuple[ndarray, ndarray, float]`	Tuple of normalized sequence data shaped `(n, time, channels)`, labels, and the scale factor needed to invert the normalization.
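
The page does not state which normalization is used. The sketch below assumes scaling by the maximum absolute value and returning the labels unchanged; both are illustrative choices, and `to_gan_format_sketch` is a hypothetical name.

```python
import numpy as np

def to_gan_format_sketch(x_train, y_train):
    """Hypothetical sketch: drop the singleton axis and scale into [-1, 1]."""
    # (n, time, 1, channels) -> (n, time, channels)
    seq = np.squeeze(x_train, axis=2).astype(np.float32)
    # Assumption: normalize by the largest absolute value in the data.
    scale = float(np.max(np.abs(seq)))
    if scale == 0.0:
        scale = 1.0  # avoid dividing by zero on all-zero input
    return seq / scale, y_train, scale

x_clf = np.random.default_rng(0).normal(size=(6, 10, 1, 3))
y_clf = np.eye(4, dtype=np.float32)[[0, 1, 2, 3, 0, 1]]
seq_norm, labels, scale = to_gan_format_sketch(x_clf, y_clf)

# Multiplying by the returned scale inverts the normalization.
restored = seq_norm * scale
```

Returning the scale alongside the data lets generated sequences be mapped back to the original amplitude range before they are fed to a classifier.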
