Tutorial 5: Creating a dataset class#
# Author: Gregoire Cattan
#
# https://github.com/plcrodrigues/Workshop-MOABB-BCI-Graz-2019
from pyriemann.classification import MDM
from pyriemann.estimation import ERPCovariances
from sklearn.pipeline import make_pipeline
from moabb.datasets import Cattan2019_VR
from moabb.datasets.braininvaders import BI2014a
from moabb.datasets.compound_dataset import CompoundDataset
from moabb.datasets.utils import blocks_reps
from moabb.evaluations import WithinSessionEvaluation
from moabb.paradigms.p300 import P300
Initialization#
This tutorial illustrates how to use a CompoundDataset to:
1) select a few subjects/sessions/runs from an existing dataset,
2) merge two CompoundDataset instances into a new one,
3) use this new dataset in a pipeline (this step is not specific to CompoundDataset).
Let’s first define a paradigm and a pipeline for evaluation.
paradigm = P300()
pipelines = {}
pipelines["MDM"] = make_pipeline(ERPCovariances(estimator="lwf"), MDM(metric="riemann"))
Creating a selection of subjects#
We are going to create two CompoundDataset instances, namely CustomDataset1 & 2. A CompoundDataset accepts a subjects_list of subjects. It is a list of tuples, where each tuple contains 4 values:
the original dataset
the subject number to select
the sessions. It can be:
a session name (‘0’)
a list of sessions ([‘0’, ‘1’])
None to select all the sessions attributed to a subject
the runs. As for sessions, it can be a single run name, a list of run names, or None (to select all runs). The sketch below illustrates these variants.
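To make the tuple format concrete, here is a hypothetical subjects_list covering the variants above. The some_dataset object and the session/run names are placeholders for illustration only, not values used in this tutorial:

subjects_list = [
    (some_dataset, 1, "0", None),  # one named session, all runs
    (some_dataset, 2, ["0", "1"], None),  # a list of sessions, all runs
    (some_dataset, 3, None, None),  # all sessions, all runs
    (some_dataset, 4, "0", ["0"]),  # one session, one named run
]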
class CustomDataset1(CompoundDataset):
    def __init__(self):
        biVR = Cattan2019_VR(virtual_reality=True, screen_display=True)
        # Build the run names for blocks 0 and 2, repetitions 0 to 4.
        runs = blocks_reps([0, 2], [0, 1, 2, 3, 4], biVR.n_repetitions)
        # Keep only the "0VR" session of subjects 1 and 2.
        subjects_list = [
            (biVR, 1, "0VR", runs),
            (biVR, 2, "0VR", runs),
        ]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset1",
            interval=[0, 1.0],
        )
class CustomDataset2(CompoundDataset):
    def __init__(self):
        bi2014 = BI2014a()
        # Select all sessions and all runs of subjects 4 and 7.
        subjects_list = [
            (bi2014, 4, None, None),
            (bi2014, 7, None, None),
        ]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset2",
            interval=[0, 1.0],
        )
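A CompoundDataset can be fed to a paradigm like any other MOABB dataset. As a quick sanity check, this sketch (using the paradigm defined above; it triggers a data download on first use) fetches the epochs, labels, and metadata for the two selected BI2014a subjects:

X, y, metadata = paradigm.get_data(CustomDataset2())
print(X.shape, len(y))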
Merging the datasets#
We are now going to merge the two CompoundDataset instances into a single one. The implementation is straightforward: instead of providing a list of subject tuples, you provide a list of CompoundDataset instances, e.g. subjects_list = [CustomDataset1(), CustomDataset2()].
class CustomDataset3(CompoundDataset):
    def __init__(self):
        subjects_list = [CustomDataset1(), CustomDataset2()]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset3",
            interval=[0, 1.0],
        )
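Assuming the compound dataset renumbers its merged subjects sequentially (an assumption about CompoundDataset internals, not something this tutorial states), the merged dataset should expose four subjects:

merged = CustomDataset3()
print(merged.subject_list)  # expected: [1, 2, 3, 4] under the renumbering assumption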
Evaluate and display#
Let’s use a WithinSessionEvaluation to evaluate our new dataset. If you already know how to do this, nothing has changed: a CompoundDataset can be used like a normal dataset.
datasets = [CustomDataset3()]
evaluation = WithinSessionEvaluation(
    paradigm=paradigm, datasets=datasets, overwrite=False, suffix="newdataset"
)
scores = evaluation.process(pipelines)
print(scores)
(progress output: within-session evaluation of CustomDataset3, including downloads of two data files of 46.4 MB and 74.3 MB)
CustomDataset3-WithinSession: 100%|██████████| 4/4 [01:18<00:00, 19.52s/it]
score time ... pipeline codecarbon_task_name
0 0.700000 0.330859 ... MDM dd36a2fa-655d-4941-8ed7-1adc052be0c1
1 0.602500 0.308502 ... MDM d95e9641-01fc-4441-b11b-474b9d93e7f8
2 0.643615 1.959295 ... MDM 299a609d-b75b-4518-b95c-2b0790f638bc
3 0.524963 4.315973 ... MDM 356778d6-b5c2-4f96-9b6c-65b478130efb
[4 rows x 11 columns]
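Since scores is a plain pandas DataFrame, you can summarize it with the usual pandas operations. For example, assuming the standard MOABB result columns subject and pipeline, this sketch averages the within-session score per subject:

print(scores.groupby(["subject", "pipeline"])["score"].mean())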
Total running time of the script: (1 minutes 19.184 seconds)
Estimated memory usage: 590 MB