Tutorial 5: Creating a dataset class#
# Author: Gregoire Cattan
#
# https://github.com/plcrodrigues/Workshop-MOABB-BCI-Graz-2019
from pyriemann.classification import MDM
from pyriemann.estimation import ERPCovariances
from sklearn.pipeline import make_pipeline
from moabb.datasets import Cattan2019_VR
from moabb.datasets.braininvaders import BI2014a
from moabb.datasets.compound_dataset import CompoundDataset
from moabb.datasets.utils import blocks_reps
from moabb.evaluations import WithinSessionEvaluation
from moabb.paradigms.p300 import P300
Initialization#
This tutorial illustrates how to use a CompoundDataset to: 1) select a few subjects/sessions/runs from an existing dataset, 2) merge two CompoundDatasets into a new one, and 3) finally use this new dataset in a pipeline (this last step is not specific to CompoundDataset).
Let’s define a paradigm and a pipeline for evaluation first.
paradigm = P300()
pipelines = {}
pipelines["MDM"] = make_pipeline(ERPCovariances(estimator="lwf"), MDM(metric="riemann"))
Creating a selection of subjects#
We are going to create two CompoundDatasets, namely CustomDataset1 and CustomDataset2. A CompoundDataset accepts a subjects_list, which is a list of tuples. Each tuple contains four values:
the original dataset
the subject number to select
the sessions. It can be:
a session name (‘0’)
a list of sessions ([‘0’, ‘1’])
None to select all the sessions attributed to a subject
the runs. As for sessions, it can be a single run name, a list of runs, or None (to select all runs). A sketch of these different forms is given below.
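For reference, here is a minimal sketch of the possible tuple forms. The dataset object is real (BI2014a, imported above), but the subject numbers and session names are purely illustrative placeholders:
bi2014 = BI2014a()
# Each entry is (dataset, subject, sessions, runs); the names below are placeholders.
example_subjects_list = [
    (bi2014, 1, "0", None),  # a single session by name, all runs
    (bi2014, 2, ["0", "1"], None),  # a list of sessions, all runs
    (bi2014, 3, None, None),  # all sessions, all runs
]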
class CustomDataset1(CompoundDataset):
    def __init__(self):
        # Cattan2019_VR contains both a PC-screen and a virtual-reality condition.
        biVR = Cattan2019_VR(virtual_reality=True, screen_display=True)
        # Runs corresponding to blocks 0 and 2, repetitions 0 to 4.
        runs = blocks_reps([0, 2], [0, 1, 2, 3, 4], biVR.n_repetitions)
        # Keep only subjects 1 and 2, session "0VR", and the selected runs.
        subjects_list = [
            (biVR, 1, "0VR", runs),
            (biVR, 2, "0VR", runs),
        ]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset1",
            interval=[0, 1.0],
        )
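Before building a full evaluation, you can sanity-check the selection by fetching the epochs through the paradigm. This is a minimal sketch, assuming you are willing to download the underlying recordings:
dataset1 = CustomDataset1()
# Fetch the preprocessed data of the first compound subject (this downloads the raw files).
X, y, metadata = paradigm.get_data(dataset=dataset1, subjects=[1])
print(X.shape, metadata["session"].unique())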
class CustomDataset2(CompoundDataset):
    def __init__(self):
        bi2014 = BI2014a()
        # Keep all sessions and all runs of subjects 4 and 7.
        subjects_list = [
            (bi2014, 4, None, None),
            (bi2014, 7, None, None),
        ]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset2",
            interval=[0, 1.0],
        )
Merging the datasets#
We are now going to merge the two CompoundDatasets into a single one. The implementation is straightforward: instead of providing a list of subjects, you provide a list of CompoundDatasets, e.g. subjects_list = [CustomDataset1(), CustomDataset2()].
class CustomDataset3(CompoundDataset):
    def __init__(self):
        subjects_list = [CustomDataset1(), CustomDataset2()]
        CompoundDataset.__init__(
            self,
            subjects_list=subjects_list,
            code="CustomDataset3",
            interval=[0, 1.0],
        )
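As an optional check, the merged dataset exposes the usual subject_list attribute, which should now contain four compound subjects, two coming from each custom dataset:
merged = CustomDataset3()
# Subjects are re-indexed inside the compound dataset.
print(merged.subject_list)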
Evaluate and display#
Let’s use a WithinSessionEvaluation to evaluate our new dataset. If you already know how to do this, nothing changes: the CompoundDataset can be used like a normal dataset.
datasets = [CustomDataset3()]
evaluation = WithinSessionEvaluation(
    paradigm=paradigm, datasets=datasets, overwrite=False, suffix="newdataset"
)
scores = evaluation.process(pipelines)
print(scores)
No hdf5_path provided, models will not be saved.
CustomDataset3-WithinSession: 100%|██████████| 4/4 [01:05<00:00, 16.35s/it]
score time samples ... n_sessions dataset pipeline
0 0.632500 0.330028 120.0 ... 1 CustomDataset3 MDM
1 0.520000 0.323644 120.0 ... 1 CustomDataset3 MDM
2 0.631404 2.106163 768.0 ... 1 CustomDataset3 MDM
3 0.544479 4.460083 1356.0 ... 1 CustomDataset3 MDM
[4 rows x 9 columns]
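To summarize these results, you could for instance average the scores per dataset and pipeline with pandas (a small sketch based on the columns of the scores DataFrame shown above):
print(scores.groupby(["dataset", "pipeline"])["score"].mean())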
Total running time of the script: (1 minutes 6.006 seconds)
Estimated memory usage: 711 MB