Tutorial 0: Getting Started#
This tutorial walks through a basic working example of how to use this codebase, covering every component up to generating results. For the statistics and plotting, see the next tutorial.
# Authors: Vinay Jayaram <vinayjayaram13@gmail.com>
#
# License: BSD (3-clause)
Introduction#
To use the codebase you need an evaluation and a paradigm, some algorithms, and a list of datasets to run it all on. You can find those in the following submodules; detailed tutorials are given for each of them.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
If you would like to set the logging level while the code is running, you can use the standard Python logging commands through the top-level moabb module.
import moabb
from moabb.datasets import BNCI2014_001, utils
from moabb.evaluations import CrossSessionEvaluation
from moabb.paradigms import LeftRightImagery
from moabb.pipelines.features import LogVariance
To create pipelines within a script, you will likely need at least the make_pipeline function. Pipelines can also be specified via a .yml file. Here we build a couple of pipelines directly in the script for convenience.
moabb.set_log_level("info")
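As a rough illustration of what a log level does, the sketch below uses only the standard library. It assumes that MOABB's loggers sit under a parent logger named "moabb" (the logger name moabb.evaluations.base in this tutorial's output suggests this, but it is an assumption here), and it does not call MOABB itself.

```python
import io
import logging

# Assumption: MOABB loggers are children of a logger named "moabb".
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(threadName)s %(name)s %(message)s")
)

parent = logging.getLogger("moabb")
parent.setLevel(logging.INFO)
parent.addHandler(handler)

# A child logger has no level of its own, so it uses the parent's level
# and propagates its records to the parent's handler.
child = logging.getLogger("moabb.evaluations.base")
child.debug("hidden: below the INFO threshold")
child.info("shown: at the INFO threshold")

captured = stream.getvalue()
print(captured)
```

Only the INFO message reaches the handler; the DEBUG message is filtered out by the level set on the parent.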
Create pipelines#
We create two pipelines: channel-wise log variance followed by LDA, and channel-wise log variance followed by a cross-validated SVM (note that cross-validation via scikit-learn cannot be described in a .yml file). Later steps expect the pipelines in a dictionary where each key is the name of a pipeline and each value is the corresponding Pipeline object.
pipelines = {}
pipelines["AM+LDA"] = make_pipeline(LogVariance(), LDA())
parameters = {"C": np.logspace(-2, 2, 10)}
clf = GridSearchCV(SVC(kernel="linear"), parameters)
pipe = make_pipeline(LogVariance(), clf)
pipelines["AM+SVM"] = pipe
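To see what such a dictionary of pipelines does without downloading any EEG data, here is a minimal sketch on synthetic epochs. The log_variance function is a hypothetical stand-in for LogVariance (assumed here to compute the log of each channel's variance over time), and the data shapes and class difference are invented for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC


def log_variance(X):
    """Hypothetical stand-in for LogVariance: log of per-channel variance over time."""
    return np.log(np.var(X, axis=-1))


# Synthetic epochs with shape (n_trials, n_channels, n_times).
# Class 1 has a larger amplitude on channel 0, so log variance separates the classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 40)
X = rng.standard_normal((80, 3, 200))
X[y == 1, 0, :] *= 2.0

# Same structure as the tutorial: name -> scikit-learn Pipeline.
demo_pipelines = {
    "AM+LDA": make_pipeline(
        FunctionTransformer(log_variance), LinearDiscriminantAnalysis()
    ),
    "AM+SVM": make_pipeline(
        FunctionTransformer(log_variance),
        GridSearchCV(SVC(kernel="linear"), {"C": np.logspace(-2, 2, 10)}),
    ),
}

train_scores = {}
for name, pipe in demo_pipelines.items():
    pipe.fit(X, y)
    train_scores[name] = pipe.score(X, y)  # training accuracy, for illustration only
print(train_scores)
```

Because every value is an ordinary scikit-learn estimator, anything that exposes fit and predict can be added to the dictionary under a new name.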
Datasets#
Datasets can be specified in several ways. Each paradigm has a datasets property that returns the datasets appropriate for that paradigm:
print(LeftRightImagery().datasets)
[BNCI2014-001, BNCI2014-004, Beetl2021-A, Beetl2021-B, Brandl2020, Chang2025, Cho2017, Dreyer2023, Dreyer2023A, Dreyer2023B, Dreyer2023C, Forenzo2023, GrosseWentrup2009, GuttmannFlury2025-ME, GuttmannFlury2025-MI, HefmiIch2025, Kaya2018, Kumar2024, Lee2019-MI, Liu2024, PhysionetMotorImagery, Schirrmeister2017, Shin2017A, Stieger2021, Wairagkar2018, Weibo2014, Wu2020, Yang2025, Zhou2016, Zhou2020]
Or you can run a search through the available datasets:
print(utils.dataset_search(paradigm="imagery", min_subjects=6))
[AguileraRodriguez2025, AlexandreMotorImagery, BCIComp2020UpperLimb, BNCI2014-001, BNCI2014-002, BNCI2014-004, BNCI2015-001, BNCI2015-004, BNCI2019-001, BNCI2020-001, BNCI2022-001, BNCI2024-001, BNCI2025-001, BNCI2025-002, Brandl2020, Chang2025, Cho2017, Dreyer2023, Dreyer2023A, Dreyer2023B, Dreyer2023C, FakeDataset-imagery-10-2--60-60--120-120--fake1-fake2-fake3--c3-cz-c4, Forenzo2023, Gao2026, GrosseWentrup2009, GuttmannFlury2025-ME, GuttmannFlury2025-MI, HefmiIch2025, Jeong2020, Kaya2018, Kumar2024, Lee2019-MI, Liu2024, Liu2025, Ma2020, Nguyen2017-L, Nguyen2017-S, Nguyen2017-SL, Nguyen2017-V, Nieto2022, Ofner2017, PhysionetMotorImagery, Pressel2016, Rozado2015, Schirrmeister2017, Shin2017A, Shin2017B, Stieger2021, Tavakolan2017, TrianaGuzman2024, Wairagkar2018, Weibo2014, Wu2020, Yang2025, Yi2025, Zhang2017, Zhou2020, Zuo2025]
Or you can simply make your own list (which we do here due to computational constraints):
dataset = BNCI2014_001()
dataset.subject_list = dataset.subject_list[:2]
datasets = [dataset]
Paradigm#
Paradigms define the events, epoch time, bandpass, and other preprocessing parameters. Their defaults are listed in the documentation, or you can set them explicitly as we do here. A single paradigm defines a method for going from continuous data to trial data of a fixed size. To learn more, see the tutorial Exploring Paradigms.
fmin = 8
fmax = 35
# You can inject custom scoring directly into the paradigm (single or multi-metric).
custom_scorer = [accuracy_score, (roc_auc_score, {"needs_threshold": True})]
paradigm = LeftRightImagery(fmin=fmin, fmax=fmax, scorer=custom_scorer)
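The step from continuous data to fixed-size trials can be sketched with NumPy and SciPy. This is not MOABB's implementation: the sampling rate, epoch window, event onsets, and filter design below are all assumptions chosen for illustration; only the 8 to 35 Hz band comes from the tutorial.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

sfreq = 250.0          # sampling rate in Hz (assumed for this sketch)
fmin, fmax = 8, 35     # band edges used in the tutorial
tmin, tmax = 0.0, 2.0  # epoch window in seconds after each event (assumed)

# Synthetic continuous recording with shape (n_channels, n_samples).
rng = np.random.default_rng(0)
raw = rng.standard_normal((3, int(20 * sfreq)))

# Hypothetical event onsets, expressed in samples.
onsets = np.array([500, 1500, 2500, 3500])

# Zero-phase Butterworth band-pass applied along the time axis.
sos = butter(4, [fmin, fmax], btype="bandpass", fs=sfreq, output="sos")
filtered = sosfiltfilt(sos, raw, axis=-1)

# Cut a window of identical length after every onset and stack the trials.
start = int(tmin * sfreq)
stop = int(tmax * sfreq)
epochs = np.stack([filtered[:, o + start : o + stop] for o in onsets])
print(epochs.shape)  # (n_trials, n_channels, n_times)
```

Every trial ends up with the same number of samples, which is what lets the pipelines treat the result as a regular array.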
Evaluation#
An evaluation defines how the training and test sets are chosen: cross-validated within a single recording, or split across days, sessions, or subjects. It is also the right place to configure multi-threaded execution.
evaluation = CrossSessionEvaluation(
paradigm=paradigm, datasets=datasets, suffix="examples", overwrite=False
)
results = evaluation.process(pipelines)
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+LDA | BNCI2014-001 | 1 | 0train: Score 0.729
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+SVM | BNCI2014-001 | 1 | 0train: Score 0.743
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+LDA | BNCI2014-001 | 1 | 1test: Score 0.715
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+SVM | BNCI2014-001 | 1 | 1test: Score 0.715
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+LDA | BNCI2014-001 | 2 | 0train: Score 0.597
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+SVM | BNCI2014-001 | 2 | 0train: Score 0.500
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+LDA | BNCI2014-001 | 2 | 1test: Score 0.521
2026-06-29 11:52:58,328 INFO MainThread moabb.evaluations.base AM+SVM | BNCI2014-001 | 2 | 1test: Score 0.500
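The idea behind a cross-session split can be sketched with scikit-learn alone. The example assumes that each session is held out in turn while the model is trained on the remaining ones, which is consistent with the per-session scores in the log above but is not taken from MOABB's source. The features are synthetic, and the session labels are borrowed from the log only as group names.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

# Synthetic features for one subject: two sessions, two balanced classes.
rng = np.random.default_rng(0)
n_per_session = 60
y = np.tile(np.repeat([0, 1], n_per_session // 2), 2)
sessions = np.repeat(["0train", "1test"], n_per_session)
X = rng.standard_normal((2 * n_per_session, 4))
X[y == 1, 0] += 3.0  # class 1 is shifted along the first feature

rows = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=sessions):
    clf = LinearDiscriminantAnalysis().fit(X[train_idx], y[train_idx])
    rows.append(
        {
            "session": sessions[test_idx][0],  # the held-out session
            "accuracy": accuracy_score(y[test_idx], clf.predict(X[test_idx])),
            # ROC AUC is computed from continuous scores, not hard labels.
            "roc_auc": roc_auc_score(
                y[test_idx], clf.decision_function(X[test_idx])
            ),
        }
    )
print(rows)
```

Each row reports the metrics on one held-out session, so a dataset with two sessions yields two rows per subject and pipeline.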
Results are returned as a pandas DataFrame. When multiple metrics are provided, MOABB adds a primary score plus one column per metric (e.g., score_accuracy_score, score_roc_auc_score).
print(results.head())
      score      time  ...  pipeline                  codecarbon_task_name
0  0.729167  0.017820  ...    AM+LDA  0778358d-3323-40bc-938f-41ccb37499b6
1  0.715278  0.013495  ...    AM+LDA  ff3fe0ff-4ba3-4ab1-aadf-ae7811728f02
2  0.597222  0.013212  ...    AM+LDA  8665e479-e5ed-4894-a4d5-6059b7be4e00
3  0.520833  0.012714  ...    AM+LDA  f3bcd09c-2bd6-4c10-a184-fc71ee0dcbda
4  0.743056  0.133880  ...    AM+SVM  537261a9-d315-4876-bfe8-fbd1dff3a093

[5 rows x 15 columns]
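Since the results are an ordinary DataFrame, they can be summarized with standard pandas operations. The sketch below rebuilds only the five rows and three columns visible in the printed output (the other columns are elided there and are not reproduced) and averages the score per pipeline.

```python
import pandas as pd

# The five rows printed above, reduced to the columns visible in the output.
demo_results = pd.DataFrame(
    {
        "score": [0.729167, 0.715278, 0.597222, 0.520833, 0.743056],
        "time": [0.017820, 0.013495, 0.013212, 0.012714, 0.133880],
        "pipeline": ["AM+LDA", "AM+LDA", "AM+LDA", "AM+LDA", "AM+SVM"],
    }
)

# Mean, spread, and number of rows per pipeline.
summary = demo_results.groupby("pipeline")["score"].agg(["mean", "std", "count"])
print(summary)
```

With the full results table, the same groupby call would cover all rows for both pipelines rather than just the first five.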
Total running time of the script: (0 minutes 35.740 seconds)