Playing with the pre-processing steps

By default, MOABB uses fundamental and robust pre-processing steps defined in each paradigm.

Behind the scenes, these steps are defined in a scikit-learn Pipeline. This pipeline receives raw signals and applies a sequence of signal processing steps to construct the final array and class labels, which are then used to train and evaluate the classifiers.

Pre-processing choices are known to affect both the scores and the ranking of EEG decoding methods [2], [3], [4], and our largest benchmark paper [1] discusses why we chose these specific steps. Using the same pre-processing steps for all datasets also avoids bias and makes results more comparable.

However, there might be cases where these steps are not adequate. MOABB allows you to modify the pre-processing pipeline. In this example, we will show how to use the make_process_pipelines method to create a custom pre-processing pipeline. We will use the MinMaxScaler from sklearn to scale the data channels to the range [0, 1].
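To make the target transformation concrete, here is a minimal sketch of min-max scaling in plain Python. It is an illustration of the formula x' = (x - min) / (max - min), not the code MOABB or sklearn runs; `min_max_scale` and the toy signals are hypothetical, and mapping a constant channel to the lower bound is simply the convention chosen here.

```python
def min_max_scale(channel, lo=0.0, hi=1.0):
    """Rescale one channel (a list of samples) linearly into [lo, hi]."""
    c_min, c_max = min(channel), max(channel)
    span = c_max - c_min
    if span == 0:
        # A constant channel has no range; map every sample to the lower bound.
        return [lo for _ in channel]
    return [lo + (x - c_min) * (hi - lo) / span for x in channel]


# Two toy "channels" with very different amplitudes (hypothetical values).
signals = [
    [-2.0, 0.0, 2.0, 6.0],
    [100.0, 150.0, 200.0, 300.0],
]

# Each channel is scaled independently, so both end up in the same range.
scaled = [min_max_scale(ch) for ch in signals]
print(scaled)
```

After scaling, channels with very different amplitudes share the same [0, 1] range, which is the effect the custom step in this example aims for.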

References

# Authors: Bruno Aristimunha Pinto <b.aristimunha@gmail.com>
#
# License: BSD (3-clause)

What is applied precisely to each paradigm?

Each paradigm defines a set of pre-processing steps that are applied to the raw data in order to construct the numpy arrays and class labels used for classification. In MOABB, the pre-processing steps are divided into three groups: the steps which are applied over the raw objects, those applied to the epoch objects, and those for the array objects.
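As a rough mental model of the three groups, the sketch below tags a few steps with a stage label and groups them. `DemoStepType` is a stand-in written for this illustration (its values mirror the `'raw'`, `'epochs'`, and `'array'` labels that appear in the pipeline representation printed later in this example); the step descriptions are informal summaries, not MOABB class names.

```python
from enum import Enum


class DemoStepType(Enum):
    """Stand-in for the stage labels; not the MOABB class itself."""

    RAW = "raw"
    EPOCHS = "epochs"
    ARRAY = "array"


# Informal, hypothetical descriptions of what happens at each stage.
steps = [
    (DemoStepType.RAW, "set annotations"),
    (DemoStepType.RAW, "band-pass filter"),
    (DemoStepType.EPOCHS, "cut raw signal into epochs"),
    (DemoStepType.ARRAY, "extract array and rescale"),
]

# Group the step descriptions by the stage they belong to.
grouped = {step_type: [] for step_type in DemoStepType}
for step_type, label in steps:
    grouped[step_type].append(label)

for step_type, labels in grouped.items():
    print(f"{step_type.value}: {labels}")
```

The point of the grouping is that a custom step has to be inserted at the stage whose object type it understands: a function written for raw objects belongs among the raw steps, not after the conversion to arrays.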

First, let’s define one dataset and one paradigm. Here, we will use the BNCI2014_001 dataset and the LeftRightImagery paradigm.

import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.preprocessing import MinMaxScaler

from moabb.datasets import BNCI2014_001
from moabb.datasets.bids_interface import StepType
from moabb.evaluations import CrossSessionEvaluation
from moabb.paradigms import FilterBankLeftRightImagery, LeftRightImagery


dataset = BNCI2014_001()
# Select one subject for the example. You can use the dataset for all subjects
dataset.subject_list = dataset.subject_list[:1]

paradigm = LeftRightImagery()

Exposing the pre-processing steps

The most efficient way to expose the pre-processing steps is to use the make_process_pipelines method. This method will return a list of pipelines that are applied to the raw data. The pipelines are defined in the paradigm object.

process_pipeline = paradigm.make_process_pipelines(dataset)

# On a non-filter-bank paradigm, there is only one branch of steps:
process_pipeline[0]
Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (8–32 Hz)', func=operator.methodcaller('filter', l_freq=8, h_freq=32, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipeli...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])


Filter Bank Paradigm

With a filter-bank paradigm, there is one branch per filter, so n filters give n pipelines:

paradigm_filterbank = FilterBankLeftRightImagery()
pre_processing_filter_bank_steps = paradigm_filterbank.make_process_pipelines(dataset)

# By default, we have six filter banks, and each filter bank has the same steps.
for i, step in enumerate(pre_processing_filter_bank_steps):
    print(f"Filter bank {i}: {step}")
Filter bank 0: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (8–12 Hz)', func=operator.methodcaller('filter', l_freq=8, h_freq=12, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipeli...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
Filter bank 1: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (12–16 Hz)', func=operator.methodcaller('filter', l_freq=12, h_freq=16, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipe...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
Filter bank 2: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (16–20 Hz)', func=operator.methodcaller('filter', l_freq=16, h_freq=20, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipe...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
Filter bank 3: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (20–24 Hz)', func=operator.methodcaller('filter', l_freq=20, h_freq=24, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipe...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
Filter bank 4: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (24–28 Hz)', func=operator.methodcaller('filter', l_freq=24, h_freq=28, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipe...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
Filter bank 5: Pipeline(steps=[(<StepType.RAW: 'raw'>,
                 SetRawAnnotations(event_id={'feet': 3, 'left_hand': 1,
                                             'right_hand': 2, 'tongue': 4},
                                   interval=[2, 6])),
                (<StepType.RAW: 'raw'>,
                 NamedFunctionTransformer(display_name='Band Pass Filter (28–32 Hz)', func=operator.methodcaller('filter', l_freq=28, h_freq=32, method='iir', picks='data', verbose=False))),
                (<StepType.EPOCHS: 'epochs'>,
                 Pipe...
                                                   RawToEpochs(baseline=None,
                                                               event_id={'left_hand': 1,
                                                                         'right_hand': 2},
                                                               tmax=6,
                                                               tmin=2.0))]))])),
                (<StepType.ARRAY: 'array'>,
                 ForkPipelines(transformers=[('X',
                                              Pipeline(steps=[('get_data',
                                                               FunctionTransformer(func=operator.methodcaller('get_data'))),
                                                              ('scaling',
                                                               FunctionTransformer(func=operator.methodcaller('__mul__', 1000000.0)))])),
                                             ('events', EpochsToEvents())]))])
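The six band-pass ranges in the output above are contiguous 4 Hz bands covering 8 to 32 Hz. The sketch below shows one way such a list of band edges could be generated; `contiguous_bands` is a hypothetical helper written for this illustration and is not part of MOABB.

```python
def contiguous_bands(f_start, f_stop, width):
    """Return adjacent (low, high) frequency bands of equal width, in Hz."""
    return [(low, low + width) for low in range(f_start, f_stop, width)]


# Reproduce the band edges seen in the printed filter-bank pipelines.
bands = contiguous_bands(8, 32, 4)
for i, (low, high) in enumerate(bands):
    print(f"Filter bank {i}: {low}-{high} Hz")
```

A list of band edges like this is the kind of value one would expect to pass through the `filters` parameter when a different set of bands is needed; check the paradigm documentation for the exact format it accepts.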
How to include extra steps?

The paradigm object accepts parameters to configure common pre-processing and epoching steps applied to the raw data. These include:

  • Bandpass filtering (filters)

  • Event selection for epoching (events)

  • Epoch time window definition (tmin, tmax)

  • Baseline correction (baseline)

  • Channel selection (channels)

  • Resampling (resample)

The following example demonstrates how you can surgically add custom processing steps beyond these built-in options.

In this example, we want to add a min-max scaling step to the raw data. To do this, we need to perform pipeline surgery and then pass the modified pipeline to the evaluation function.
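The general idea of pipeline surgery can be sketched with scikit-learn alone. A `Pipeline` keeps its steps in a plain list of (name, transformer) tuples, so a new step can be inserted with an ordinary list operation. The toy pipeline, the step names, and `minmax_channels` below are assumptions made for illustration; they operate on a NumPy array, not on MNE raw objects, and are not the steps MOABB builds.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler


def minmax_channels(X):
    """Scale each row (channel) of X to [0, 1].

    MinMaxScaler works column-wise, so we transpose before and after.
    """
    return MinMaxScaler(feature_range=(0, 1)).fit_transform(X.T).T


# Toy stand-in for a pre-processing pipeline with two existing steps.
pipeline = Pipeline(
    steps=[
        ("demean", FunctionTransformer(lambda X: X - X.mean(axis=1, keepdims=True))),
        ("to_float32", FunctionTransformer(lambda X: X.astype(np.float32))),
    ]
)

# Pipeline surgery: insert the new step between the two existing ones.
pipeline.steps.insert(1, ("minmax", FunctionTransformer(minmax_channels)))

X = np.array([[1.0, 2.0, 3.0], [10.0, 30.0, 50.0]])  # (n_channels, n_times)
X_out = pipeline.fit_transform(X)

print([name for name, _ in pipeline.steps])
print(X_out.min(axis=1), X_out.max(axis=1))
```

With MOABB, the analogous move would presumably be a `steps.insert` call on `process_pipeline[0]`, placed among the raw-level steps and using a transformer that accepts MNE raw objects. The exact insertion index and the raw-level transformer are assumptions read off the printed pipeline above and are not tested here.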


Now that you have defined a custom pre-processing step, you need to run the evaluation with it to get the results. Here, we will use the DummyClassifier from sklearn as the classifier.

classifier_pipeline = {}
classifier_pipeline["dummy"] = DummyClassifier()

evaluation = CrossSessionEvaluation(paradigm=paradigm)

generator_results = evaluation.evaluate(
    dataset=dataset,
    pipelines=classifier_pipeline,
    param_grid=None,
    # Assumption: evaluate() takes a single pipeline, not the list returned above.
    process_pipeline=process_pipeline[0],
)
# The evaluation function will return a generator object that contains the results
# of the evaluation. You can use the `list` function to convert it to a list.
results = list(generator_results)
BNCI2014-001-CrossSession:   0%|          | 0/1 [00:00<?, ?it/s]
BNCI2014-001-CrossSession: 100%|██████████| 1/1 [00:04<00:00,  4.16s/it]

Plot Results

Then you can follow the common procedure for analyzing the results.

df_results = pd.DataFrame(results)

df_results.plot(
    x="pipeline",
    y="score",
    kind="bar",
    title="Results of the evaluation with custom pre-processing steps",
    xlabel="Pipeline",
    ylabel="Score",
)
[Figure: bar plot titled "Results of the evaluation with custom pre-processing steps"; x-axis "Pipeline", y-axis "Score"]
<Axes: title={'center': 'Results of the evaluation with custom pre-processing steps'}, xlabel='Pipeline', ylabel='Score'>

Total running time of the script: (0 minutes 5.585 seconds)

Estimated memory usage: 568 MB
