Explore Paradigm Object
A paradigm defines how the raw data will be converted to trials ready to be processed by a decoding algorithm. This is a function of the paradigm used, i.e. in motor imagery one can have two-class, multi-class, or continuous paradigms; similarly, different preprocessing is necessary for ERP vs ERD paradigms.
A paradigm also defines the appropriate evaluation metric, for example AUC for binary classification problems, accuracy for multiclass, or kappa coefficients for continuous paradigms.
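The metric convention described above can be sketched as a small helper function. Note that `pick_metric` is a hypothetical illustration, not a MOABB API:

```python
def pick_metric(n_classes: int, continuous: bool = False) -> str:
    # Mirrors the convention above: AUC for binary classification,
    # accuracy for multiclass, kappa for continuous paradigms.
    # (Hypothetical helper, not part of MOABB.)
    if continuous:
        return "kappa"
    return "roc_auc" if n_classes == 2 else "accuracy"

print(pick_metric(2), pick_metric(4), pick_metric(4, continuous=True))
```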
This tutorial explores the paradigm object through three examples: MotorImagery, FilterBankMotorImagery, and LeftRightImagery.
```python
# Authors: Alexandre Barachant <email@example.com>
#          Sylvain Chevallier <firstname.lastname@example.org>
#
# License: BSD (3-clause)
import numpy as np

from moabb.datasets import BNCI2014001
from moabb.paradigms import FilterBankMotorImagery, LeftRightImagery, MotorImagery

print(__doc__)
```
First, let’s take an example of the MotorImagery paradigm.
```python
paradigm = MotorImagery(n_classes=4)
print(paradigm.__doc__)
```
N-class motor imagery.

Metric is 'roc-auc' if 2 classes and 'accuracy' if more

Parameters
----------
events: List of str
    event labels used to filter datasets (e.g. if only motor imagery is
    desired).
n_classes: int
    number of classes each dataset must have. If events is given, requires
    all imagery sorts to be within the events list.
fmin: float (default 8)
    cutoff frequency (Hz) for the high pass filter
fmax: float (default 32)
    cutoff frequency (Hz) for the low pass filter
tmin: float (default 0.0)
    Start time (in seconds) of the epoch, relative to the dataset specific
    task interval, e.g. tmin = 1 would mean the epoch will start 1 second
    after the beginning of the task as defined by the dataset.
tmax: float | None (default None)
    End time (in seconds) of the epoch, relative to the beginning of the
    dataset specific task interval. tmax = 5 would mean the epoch will end
    5 seconds after the beginning of the task as defined in the dataset.
    If None, use the dataset value.
baseline: None | tuple of length 2
    The time interval to consider as "baseline" when applying baseline
    correction. If None, do not apply baseline correction. If a tuple (a, b),
    the interval is between a and b (in seconds), including the endpoints.
    Correction is applied by computing the mean of the baseline period and
    subtracting it from the data (see mne.Epochs).
channels: list of str | None (default None)
    list of channels to select. If None, use all EEG channels available in
    the dataset.
resample: float | None (default None)
    If not None, resample the EEG data with the sampling rate provided.
The get_data function gives access to preprocessed data from a dataset. It returns three objects: a NumPy array containing the preprocessed EEG data, the labels, and a dataframe with metadata.
Return the data for a list of subjects.

Returns the data, labels and a dataframe with metadata. The dataframe will
contain at least the following columns:

- subject: the subject index
- session: the session index
- run: the run index

Parameters
----------
dataset:
    A dataset instance.
subjects: List of int
    List of subject numbers.
return_epochs: boolean
    This flag specifies whether to return only the data array or the
    complete processed mne.Epochs.

Returns
-------
X : Union[np.ndarray, mne.Epochs]
    the data that will be used as features for the model.
    Note: if return_epochs=True, this is mne.Epochs;
    if return_epochs=False, this is np.ndarray.
labels: np.ndarray
    the labels for training / evaluating the model.
metadata: pd.DataFrame
    A dataframe containing the metadata.
Let's take the example of the BNCI2014001 dataset, known as dataset IIa from BCI Competition IV. We will load the data from subject 1. When calling get_data, the paradigm will retrieve the data from the specified list of subjects, apply preprocessing (by default, a bandpass between 7 and 35 Hz), epoch the data (with the interval specified by the dataset, unless superseded by the paradigm), and return the corresponding objects.
The epoched data is a 3D array, with epochs on the first dimension (here 576 trials), channels on the second (22 channels), and time samples on the last one.
(576, 22, 1001)
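The layout of such an array can be explored with basic NumPy indexing. Here we use random data of the same shape as a stand-in for the real EEG array, since the actual data requires a download; the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated epoched data: 576 trials x 22 channels x 1001 time samples
X = rng.standard_normal((576, 22, 1001))

first_trial = X[0]         # all channels of the first epoch -> (22, 1001)
one_channel = X[:, 5]      # a single channel across all epochs -> (576, 1001)

print(first_trial.shape, one_channel.shape)
```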
Labels contain the label corresponding to each trial. In the case of this dataset, we have the four types of motor imagery that were performed.
['feet' 'left_hand' 'right_hand' 'tongue']
The metadata has at least three columns: subject, session and run.
subject is the subject id of the corresponding trial
session is the session id. A session denotes a recording made without removing the EEG cap.
run is the individual continuous recording made during a session. A session may or may not contain multiple runs.
```
   subject    session    run
0        1  session_T  run_0
1        1  session_T  run_0
2        1  session_T  run_0
3        1  session_T  run_0
4        1  session_T  run_0
```
For this data, we have one subject, two sessions (two different recording days) and six runs per session.
```
        subject    session    run
count     576.0        576    576
unique      NaN          2      6
top         NaN  session_E  run_5
freq        NaN        288     96
mean        1.0        NaN    NaN
std         0.0        NaN    NaN
min         1.0        NaN    NaN
25%         1.0        NaN    NaN
50%         1.0        NaN    NaN
75%         1.0        NaN    NaN
max         1.0        NaN    NaN
```
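The same kind of summary can be reproduced on any metadata dataframe with standard pandas operations. Here we build a toy dataframe with the same columns (the values are illustrative, not the real BNCI2014001 metadata):

```python
import pandas as pd

# Toy metadata with the same columns as returned by get_data
# (values illustrative, not the real dataset's metadata)
metadata = pd.DataFrame(
    {
        "subject": [1] * 6,
        "session": ["session_T"] * 3 + ["session_E"] * 3,
        "run": ["run_0", "run_1", "run_2"] * 2,
    }
)

n_sessions = metadata["session"].nunique()
trials_per_session = metadata.groupby("session").size()
print(n_sessions)
print(trials_per_session)
```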
Paradigm objects can also return the list of all compatible datasets. Here it will return the list of all the imagery datasets from MOABB.
['Alexandre Motor Imagery', '001-2014', '002-2014', '004-2014', '001-2015', '004-2015', 'Cho2017', 'Lee2019_MI', 'Grosse-Wentrup 2009', 'Ofner2017', 'Physionet Motor Imagery', 'Schirrmeister2017', 'Shin2017A', 'Weibo 2014', 'Zhou 2016']
FilterBankMotorImagery is the same paradigm, but with a different preprocessing. In this case, it applies a bank of 6 bandpass filters on the data before concatenating the output.
```python
paradigm = FilterBankMotorImagery()
print(paradigm.__doc__)
```
Filter bank n-class motor imagery.

Metric is 'roc-auc' if 2 classes and 'accuracy' if more

Parameters
----------
events: List of str
    event labels used to filter datasets (e.g. if only motor imagery is
    desired).
n_classes: int
    number of classes each dataset must have. If events is given, requires
    all imagery sorts to be within the events list.
Therefore, the output X is a 4D array, with dimensions trial x channel x time x filter.
(288, 22, 1001, 6)
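The shape bookkeeping of the filter bank can be sketched with NumPy. In the real paradigm each copy of the data would be band-pass filtered in a different frequency band; here we only stack unfiltered copies of a simulated array to show how the trailing filter axis arises:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated single-band epochs: 288 trials x 22 channels x 1001 samples
X = rng.standard_normal((288, 22, 1001))

# The real paradigm band-pass filters the data in each of 6 bands; here we
# only sketch the stacking with identical copies.
n_filters = 6
X_fb = np.stack([X for _ in range(n_filters)], axis=-1)

print(X_fb.shape)  # trial x channel x time x filter
```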
LeftRightImagery is a variation over the BaseMotorImagery paradigm, restricted to left- and right-hand events.
```python
paradigm = LeftRightImagery()
print(paradigm.__doc__)
```
Motor Imagery for left hand/right hand classification

Metric is 'roc_auc'
The list of compatible datasets is the subset of motor imagery datasets that contain at least left- and right-hand events.
['001-2014', '004-2014', 'Cho2017', 'Lee2019_MI', 'Grosse-Wentrup 2009', 'Physionet Motor Imagery', 'Schirrmeister2017', 'Shin2017A', 'Weibo 2014', 'Zhou 2016']
So if we apply this to our original dataset, it will only return trials corresponding to left- and right-hand motor imagination.
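This restriction amounts to keeping only the trials whose label is left or right hand, which can be sketched with a boolean mask over a labels array. The labels here are illustrative, mimicking the four-class dataset above:

```python
import numpy as np

# Illustrative labels mimicking the four-class dataset above
labels = np.array(["feet", "left_hand", "right_hand", "tongue"] * 4)

# LeftRightImagery keeps only left- and right-hand trials; the same
# restriction can be expressed as a boolean mask
mask = np.isin(labels, ["left_hand", "right_hand"])
lr_labels = labels[mask]

print(lr_labels.shape)
```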
Total running time of the script: ( 0 minutes 16.078 seconds)