Preprocessing #

Preprocessing, augmentation, and split helpers.

Classes:

Name Description
PreparedSplits

Preprocessed arrays ready for classifier training.

Functions:

Name Description
set_numpy_seed

Set NumPy's global random seed.

labels_to_zero_based

Convert BCI IV-2a event labels 769-772 to class IDs 0-3.

one_hot

Convert integer class labels to one-hot rows.

augment_trials

Apply the package temporal augmentation routine.

to_channels_last_image

Convert (trials, channels, time) to (trials, time, 1, channels).

select_subject

Filter arrays to one subject, or return copies for all subjects.

prepare_splits

Create augmented train/validation/test arrays ready for classifiers.

prepare_gan_training_arrays

Normalize classifier arrays into the GAN sequence format.

PreparedSplits dataclass #

PreparedSplits(
    x_train: ndarray,
    y_train: ndarray,
    x_valid: ndarray,
    y_valid: ndarray,
    x_test: ndarray,
    y_test: ndarray,
)

Preprocessed arrays ready for classifier training.

Attributes:

Name Type Description
x_train ndarray

Augmented training inputs shaped (n, time, 1, channels).

y_train ndarray

One-hot training labels shaped (n, 4).

x_valid ndarray

Augmented validation inputs.

y_valid ndarray

One-hot validation labels.

x_test ndarray

Augmented held-out test inputs.

y_test ndarray

One-hot held-out test labels.
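
As a rough illustration of how the six arrays fit together, the sketch below mirrors the documented fields in a stand-in dataclass and checks that each input/label pair is aligned. `PreparedSplitsSketch` and `check_alignment` are hypothetical names introduced for this example; they are not part of the package.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PreparedSplitsSketch:
    """Hypothetical mirror of the documented PreparedSplits fields."""

    x_train: np.ndarray
    y_train: np.ndarray
    x_valid: np.ndarray
    y_valid: np.ndarray
    x_test: np.ndarray
    y_test: np.ndarray


def check_alignment(splits: PreparedSplitsSketch) -> None:
    """Raise ValueError if any inputs/labels pair is not aligned."""
    for name in ("train", "valid", "test"):
        x = getattr(splits, f"x_{name}")
        y = getattr(splits, f"y_{name}")
        # Inputs are documented as (n, time, 1, channels).
        if x.ndim != 4 or x.shape[2] != 1:
            raise ValueError(f"x_{name} must be shaped (n, time, 1, channels)")
        # Labels are documented as one-hot rows, one per input.
        if y.ndim != 2 or y.shape[0] != x.shape[0]:
            raise ValueError(f"y_{name} must have one one-hot row per x_{name} entry")


example = PreparedSplitsSketch(
    x_train=np.zeros((8, 250, 1, 22), dtype=np.float32),
    y_train=np.zeros((8, 4), dtype=np.float32),
    x_valid=np.zeros((2, 250, 1, 22), dtype=np.float32),
    y_valid=np.zeros((2, 4), dtype=np.float32),
    x_test=np.zeros((3, 250, 1, 22), dtype=np.float32),
    y_test=np.zeros((3, 4), dtype=np.float32),
)
check_alignment(example)
```

The array sizes used here (250 time steps, 22 channels) are illustrative values only.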

set_numpy_seed #

set_numpy_seed(seed: int) -> None

Set NumPy's global random seed.

Parameters:

Name Type Description Default
seed int

Integer seed value.

required
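
A minimal sketch of what seeding the global state looks like, assuming the helper simply forwards to `np.random.seed`. The name `set_numpy_seed_sketch` is a stand-in for illustration, not the package function.

```python
import numpy as np


def set_numpy_seed_sketch(seed: int) -> None:
    """Hypothetical stand-in: seed NumPy's legacy global random state."""
    np.random.seed(seed)


# Re-seeding with the same value reproduces the same draws
# from the global np.random functions.
set_numpy_seed_sketch(0)
first = np.random.rand(3)
set_numpy_seed_sketch(0)
second = np.random.rand(3)
```

Generators created with `np.random.default_rng` are independent of this global state, so code that receives an explicit `Generator` is unaffected by the call.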

labels_to_zero_based #

labels_to_zero_based(
    y: ndarray, label_offset: int = 769
) -> ndarray

Convert BCI IV-2a event labels 769-772 to class IDs 0-3.

Parameters:

Name Type Description Default
y ndarray

Raw BCI cue labels.

required
label_offset int

Value subtracted from each label.

769

Returns:

Type Description
ndarray

Integer class labels in [0, 3].

Raises:

Type Description
ValueError

If converted labels fall outside [0, 3].
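
The conversion described above can be sketched as a subtraction followed by a range check. This is an illustrative stand-in (`labels_to_zero_based_sketch`), assuming the check is applied after the offset is subtracted, as the Raises entry suggests.

```python
import numpy as np


def labels_to_zero_based_sketch(y: np.ndarray, label_offset: int = 769) -> np.ndarray:
    """Hypothetical stand-in: shift cue labels so 769..772 become 0..3."""
    shifted = np.asarray(y).astype(np.int64) - label_offset
    # Reject anything that does not land on one of the four class IDs.
    if shifted.size and (shifted.min() < 0 or shifted.max() > 3):
        raise ValueError("converted labels must fall inside [0, 3]")
    return shifted


raw = np.array([769, 772, 770, 771])
class_ids = labels_to_zero_based_sketch(raw)
```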

one_hot #

one_hot(y: ndarray, n_classes: int = 4) -> ndarray

Convert integer class labels to one-hot rows.

Parameters:

Name Type Description Default
y ndarray

Integer labels.

required
n_classes int

Number of classes.

4

Returns:

Type Description
ndarray

Float32 one-hot array shaped (len(y), n_classes).
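
One common way to build such an array is to index the rows of an identity matrix. The sketch below assumes that approach; `one_hot_sketch` is a stand-in name and the package may implement the encoding differently.

```python
import numpy as np


def one_hot_sketch(y: np.ndarray, n_classes: int = 4) -> np.ndarray:
    """Hypothetical stand-in: one float32 one-hot row per integer label."""
    y = np.asarray(y, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(n_classes, dtype=np.float32)[y]


encoded = one_hot_sketch(np.array([0, 3, 1]))
```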

augment_trials #

augment_trials(
    X: ndarray,
    y: ndarray,
    config: PreprocessingConfig,
    rng: Generator | None = None,
) -> tuple[ndarray, ndarray]

Apply the package temporal augmentation routine.

The routine trims each trial, creates a max-pooled copy, creates an averaged copy with added Gaussian noise, and appends one subsampled copy for each temporal subsampling offset.

Parameters:

Name Type Description Default
X ndarray

EEG trials shaped (trials, channels, time).

required
y ndarray

Trial labels aligned with X.

required
config PreprocessingConfig

Preprocessing and augmentation settings.

required
rng Generator | None

Optional NumPy random generator.

None

Returns:

Type Description
tuple[ndarray, ndarray]

Tuple of augmented trials and aligned labels.

Raises:

Type Description
ValueError

If array dimensions or temporal settings are incompatible.
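
The following is a rough sketch of an augmentation routine in the spirit of the description above, not the package implementation. The config fields (`trim_length`, `step`, `noise_std`), their default values, the use of a single `step` for pooling, averaging, and subsampling, and the order of the stacked copies are all assumptions made for illustration; the real `PreprocessingConfig` may differ.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SketchConfig:
    """Hypothetical settings; not the real PreprocessingConfig."""

    trim_length: int = 500   # time samples kept per trial
    step: int = 2            # pooling window and subsampling stride
    noise_std: float = 0.5   # std of noise added to the averaged copy


def augment_trials_sketch(X, y, config, rng=None):
    if X.ndim != 3 or len(X) != len(y):
        raise ValueError("X must be (trials, channels, time) and aligned with y")
    if config.trim_length > X.shape[2] or config.trim_length % config.step:
        raise ValueError("trim_length must fit the trial and be divisible by step")
    rng = np.random.default_rng() if rng is None else rng

    n, c, _ = X.shape
    trimmed = X[:, :, : config.trim_length]
    # Group consecutive samples into windows of `step` along time.
    windows = trimmed.reshape(n, c, -1, config.step)
    pooled = windows.max(axis=3)
    averaged = windows.mean(axis=3)
    averaged = averaged + rng.normal(0.0, config.noise_std, averaged.shape)

    pieces = [pooled, averaged]
    # One subsampled copy per starting offset.
    for offset in range(config.step):
        pieces.append(trimmed[:, :, offset :: config.step])

    X_aug = np.concatenate(pieces, axis=0)
    y_aug = np.concatenate([y] * len(pieces), axis=0)
    return X_aug, y_aug


rng = np.random.default_rng(0)
X_demo = rng.standard_normal((3, 22, 600))
y_demo = np.array([0, 1, 2])
X_aug, y_aug = augment_trials_sketch(X_demo, y_demo, SketchConfig(), rng)
```

With these assumed settings every copy has `trim_length // step` time samples, so the copies can be stacked along the trial axis and the labels repeated once per copy.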

to_channels_last_image #

to_channels_last_image(X: ndarray) -> ndarray

Convert (trials, channels, time) to (trials, time, 1, channels).

Parameters:

Name Type Description Default
X ndarray

EEG trials shaped (trials, channels, time).

required

Returns:

Type Description
ndarray

Float32 channels-last image-like representation used by classifiers.
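
The shape change can be sketched as a transpose followed by inserting a singleton axis. This stand-in (`to_channels_last_image_sketch`) assumes that element `[n, t, 0, c]` of the output corresponds to `[n, c, t]` of the input; the package may reach the same shape by a different sequence of operations.

```python
import numpy as np


def to_channels_last_image_sketch(X: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the documented shape conversion."""
    X = np.asarray(X, dtype=np.float32)
    # (trials, channels, time) -> (trials, time, channels)
    swapped = np.transpose(X, (0, 2, 1))
    # Insert a width-1 axis: (trials, time, 1, channels)
    return swapped[:, :, np.newaxis, :]


X_small = np.arange(2 * 3 * 5).reshape(2, 3, 5)
images = to_channels_last_image_sketch(X_small)
```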

select_subject #

select_subject(
    X: ndarray,
    y: ndarray,
    persons: ndarray,
    subject_id: int | None,
) -> tuple[ndarray, ndarray]

Filter arrays to one subject, or return copies for all subjects.

Parameters:

Name Type Description Default
X ndarray

EEG trial array.

required
y ndarray

Labels aligned with X.

required
persons ndarray

Subject IDs aligned with X.

required
subject_id int | None

Subject ID to keep, or None for all subjects.

required

Returns:

Type Description
tuple[ndarray, ndarray]

Filtered (X, y) pair.

Raises:

Type Description
ValueError

If subject_id is provided but no matching trials exist.
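
A boolean mask over `persons` is one straightforward way to express this filter. The sketch below is a stand-in (`select_subject_sketch`); flattening `persons` before the comparison is an assumption added so that a column-shaped ID array also works.

```python
import numpy as np


def select_subject_sketch(X, y, persons, subject_id):
    """Hypothetical stand-in: keep one subject's trials, or copy everything."""
    if subject_id is None:
        return X.copy(), y.copy()
    # Flatten so both (n,) and (n, 1) subject arrays are accepted.
    mask = np.asarray(persons).reshape(-1) == subject_id
    if not mask.any():
        raise ValueError(f"no trials found for subject_id={subject_id}")
    return X[mask], y[mask]


X_all = np.arange(4 * 2 * 3).reshape(4, 2, 3)
y_all = np.array([0, 1, 2, 3])
persons_all = np.array([[0], [1], [0], [2]])
X_sub, y_sub = select_subject_sketch(X_all, y_all, persons_all, subject_id=0)
```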

prepare_splits #

prepare_splits(
    bundle: DatasetBundle,
    preprocess: PreprocessingConfig | None = None,
    split: DataSplitConfig | None = None,
) -> PreparedSplits

Create augmented train/validation/test arrays ready for classifiers.

Parameters:

Name Type Description Default
bundle DatasetBundle

Raw processed arrays in the original six-file layout.

required
preprocess PreprocessingConfig | None

Optional preprocessing configuration.

None
split DataSplitConfig | None

Optional split configuration.

None

Returns:

Type Description
PreparedSplits

PreparedSplits containing channels-last inputs and one-hot labels.

prepare_gan_training_arrays #

prepare_gan_training_arrays(
    x_train: ndarray, y_train: ndarray
) -> tuple[ndarray, ndarray, float]

Normalize classifier arrays into the GAN sequence format.

Parameters:

Name Type Description Default
x_train ndarray

Classifier inputs shaped (n, time, 1, channels).

required
y_train ndarray

One-hot labels.

required

Returns:

Type Description
ndarray

Tuple of normalized sequence data (n, time, channels), labels, and the

ndarray

scale factor needed to invert the normalization.
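
To illustrate the shape change and the role of the returned scale factor, here is a sketch that drops the singleton axis and divides by the maximum absolute value. The max-abs scaling rule is an assumption; the source only states that a scale factor is returned so the normalization can be inverted. `prepare_gan_training_arrays_sketch` is a stand-in name.

```python
import numpy as np


def prepare_gan_training_arrays_sketch(x_train, y_train):
    """Hypothetical stand-in: (n, time, 1, channels) -> (n, time, channels)."""
    # Drop the width-1 axis used by the classifier input format.
    sequences = np.asarray(x_train, dtype=np.float32)[:, :, 0, :]
    # Assumed normalization: divide by the largest absolute value.
    scale = float(np.max(np.abs(sequences)))
    if scale == 0.0:
        scale = 1.0  # avoid dividing by zero on all-zero input
    labels = np.asarray(y_train, dtype=np.float32)
    return sequences / scale, labels, scale


rng = np.random.default_rng(0)
x_demo = rng.standard_normal((6, 250, 1, 22)).astype(np.float32)
y_demo = np.eye(4, dtype=np.float32)[np.array([0, 1, 2, 3, 0, 1])]
x_gan, y_gan, scale = prepare_gan_training_arrays_sketch(x_demo, y_demo)

# Multiplying by the scale factor recovers the original amplitudes.
restored = x_gan * scale
```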