A dataset handles and abstracts low-level access to the data. It takes data stored locally, in the format in which they were downloaded, and converts them into an MNE raw object. There are options to pool all the recording sessions per subject or to evaluate them separately.
See NeuroTechX/moabb for details on the datasets (electrodes, number of trials, sessions, etc.).
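The difference between pooling sessions and evaluating them separately can be sketched in plain Python. This is only an illustration: it assumes the loaded data is nested as subject, then session, then run, and it uses placeholder strings where the MNE raw objects would be. The names `data`, `pooled` and `per_session` are hypothetical and are not part of MOABB.

```python
# Placeholder strings stand in for MNE raw objects; every name here is hypothetical.
data = {
    1: {  # subject number
        "session_0": {"run_0": "raw_s1_sess0_run0", "run_1": "raw_s1_sess0_run1"},
        "session_1": {"run_0": "raw_s1_sess1_run0"},
    },
}


def pooled(subject_data):
    """Pool every run of every session of one subject into a single list."""
    return [raw for runs in subject_data.values() for raw in runs.values()]


def per_session(subject_data):
    """Keep the sessions apart: one list of runs per session."""
    return {name: list(runs.values()) for name, runs in subject_data.items()}


all_runs = pooled(data[1])
by_session = per_session(data[1])

print(len(all_runs))                                  # 3
print({k: len(v) for k, v in by_session.items()})     # {'session_0': 2, 'session_1': 1}
```

Pooling gives one larger set of recordings per subject, while the per-session view makes it possible to train on one session and test on another.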
Data Summary
MOABB gathers many datasets; the tables below summarize their most important properties. Most of the datasets are listed here, but the list is not complete yet; check the API for the complete documentation.
Do not hesitate to help us complete this list. It is also possible to add new datasets: there is a tutorial explaining how to do so, and we warmly welcome new contributions!
See also Datasets-Support for supplementary details on the datasets (class name, size, licence, etc.).
Columns: Dataset, #Subj, #Chan, #Classes, #Trials, Trial length, Freq, #Session, #Runs, Total_trials, PapersWithCode leaderboard.
Column definitions:
* Dataset is the name of the dataset.
* #Subj is the number of subjects.
* #Chan is the number of EEG channels.
* #Trials / class is the number of repetitions performed by one subject for each class. This number is computed using only the first subject of each dataset. The definitions of a class and of a trial depend on the paradigm used (see the sections below).
* Trial length is the duration of a trial in seconds.
* Total_trials is the total number of trials in the dataset (all subjects and classes together).
* Freq is the sampling frequency of the raw data.
* #Session is the number of sessions per subject. Different sessions are often recorded on different days.
* #Runs is the number of runs per session. A run is a continuous recording of the EEG data. Often, the different runs of a given session are recorded without removing the EEG cap in between.
* PapersWithCode leaderboard is the link to the dataset on the PapersWithCode leaderboard.
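As a rough check of how these columns relate, the sketch below computes a nominal trial count from #Subj, #Classes and #Trials / class. It assumes that every subject performs the same number of trials as the first subject, which the definitions above do not guarantee, so the result is only an estimate. The function name `nominal_total_trials` is hypothetical.

```python
def nominal_total_trials(n_subjects, n_classes, trials_per_class):
    """Estimate the total trial count, assuming all subjects match the first one."""
    return n_subjects * n_classes * trials_per_class


# Values taken from the first row of the Motor Imagery table:
# 8 subjects, 3 classes, 20 trials per class.
first_row_total = nominal_total_trials(8, 3, 20)
print(first_row_total)  # 480
```

For that row the estimate equals the listed Total_trials of 480. Other rows can deviate, for example when subjects differ in their number of sessions or trials.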
Motor Imagery
Motor Imagery is a BCI paradigm where the subject imagines performing movements. Each movement is associated with a different command to build an application.
Motor Imagery-specific definitions:
* #Classes is the number of different imagery tasks.
* A trial is one repetition of the imagery task.
| Dataset | #Subj | #Chan | #Classes | #Trials / class | Trial length | Freq | #Session | #Runs | Total_trials | PapersWithCode leaderboard |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 8 | 16 | 3 | 20 | 3s | 512Hz | 1 | 1 | 480 |  |
|  | 9 | 22 | 4 | 144 | 4s | 250Hz | 2 | 6 | 62208 |  |
|  | 14 | 15 | 2 | 80 | 5s | 512Hz | 1 | 8 | 17920 |  |
|  | 9 | 3 | 2 | 360 | 4.5s | 250Hz | 5 | 1 | 32400 |  |
|  | 12 | 13 | 2 | 200 | 5s | 512Hz | 3 | 1 | 14400 |  |
|  | 9 | 30 | 5 | 80 | 7s | 256Hz | 2 | 1 | 7200 |  |
|  | 52 | 64 | 2 | 100 | 3s | 512Hz | 1 | 1 | 9800 |  |
|  | 54 | 62 | 2 | 100 | 4s | 1000Hz | 2 | 1 | 11000 |  |
|  | 10 | 128 | 2 | 150 | 7s | 500Hz | 1 | 1 | 3000 |  |
|  | 14 | 128 | 4 | 120 | 4s | 500Hz | 1 | 2 | 13440 |  |
|  | 15 | 61 | 7 | 60 | 3s | 512Hz | 1 | 10 | 63000 | No |
|  | 109 | 64 | 4 | 23 | 3s | 160Hz | 1 | 1 | 69760 |  |
|  | 29 | 30 | 2 | 30 | 10s | 200Hz | 3 | 1 | 5220 |  |
|  | 29 | 30 | 2 | 30 | 10s | 200Hz | 3 | 1 | 5220 | No |
|  | 10 | 60 | 7 | 80 | 4s | 200Hz | 1 | 1 | 5600 |  |
|  | 4 | 14 | 3 | 160 | 5s | 250Hz | 3 | 2 | 11496 |  |
|  | 62 | 64 | 4 | 450 | 3s | 1000Hz | 7 or 11 | 1 | 250000 | No |
|  | 50 | 29 | 2 | 20 | 4s | 500Hz | 1 | 1 | 2000 | No |
P300/ERP
ERP (Event-Related Potential) is a BCI paradigm where the subject is presented with a stimulus and the EEG response is recorded. The P300 is a positive peak in the EEG signal that occurs around 300 ms after the stimulus.
P300-specific definitions:
* A trial is one flash.
* The classes are binary: a trial is target (T) if the key on which the subject focuses is flashed and non-target (NT) otherwise.
| Dataset | #Subj | #Chan | #Trials / class | Trial length | Sampling rate | #Sessions | PapersWithCode leaderboard |
|---|---|---|---|---|---|---|---|
|  | 8 | 8 | 3500 NT / 700 T | 1s | 256Hz | 1 |  |
|  | 10 | 16 | 1440 NT / 288 T | 0.8s | 256Hz | 3 |  |
|  | 10 | 8 | 1500 NT / 300 T | 0.8s | 256Hz | 1 |  |
|  | 25 | 16 | 640 NT / 128 T | 1s | 128Hz | 2 |  |
|  | 24 | 16 | 3200 NT / 640 T | 1s | 512Hz | 8 for subjects 1-7 else 1 |  |
|  | 64 | 16 | 990 NT / 198 T | 1s | 512Hz | up to 3 |  |
|  | 38 | 32 | 200 NT / 40 T | 1s | 512Hz | 3 |  |
|  | 43 | 32 | 4131 NT / 825 T | 1s | 512Hz | 3 |  |
|  | 44 | 32 | 2160 NT / 480 T | 1s | 512Hz | 1 |  |
|  | 21 | 16 | 600 NT / 120 T | 1s | 512Hz | 2 |  |
|  | 13 | 31 | 364 NT / 112 T | 0.9s | 1000Hz | 3 |  |
|  | 12 | 31 | 364 NT / 112 T | 0.9s | 1000Hz | 3 |  |
|  | 13 | 31 | 7500 NT / 1500 T | 1.2s | 1000Hz | 1 |  |
|  | 8 | 32 | 2753 NT / 551 T | 1s | 2048Hz | 4 |  |
|  | 54 | 62 | 6900 NT / 1380 T | 1s | 1000Hz | 2 |  |
|  | 60 | 8 | 935 NT / 50 T | 1s | 500Hz | 1 | No |
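The "NT / T" entries of the P300 table describe a strongly imbalanced binary problem. As a small illustration, the sketch below parses one such entry and computes the fraction of target trials. The helper `parse_trials` is hypothetical and only handles the exact "N NT / M T" pattern used in the table.

```python
import re


def parse_trials(cell):
    """Split a table cell such as '3500 NT / 700 T' into (non_target, target) counts."""
    match = re.fullmatch(r"\s*(\d+)\s*NT\s*/\s*(\d+)\s*T\s*", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1)), int(match.group(2))


# Entry taken from the first row of the P300 table.
non_target, target = parse_trials("3500 NT / 700 T")
target_fraction = target / (non_target + target)

print(non_target, target)         # 3500 700
print(round(target_fraction, 3))  # 0.167
```

With only about one trial in six being a target, plain accuracy can look high for a classifier that never predicts the target class, so a metric that accounts for the imbalance is usually a safer choice.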
SSVEP
SSVEP (Steady-State Visually Evoked Potential) is a BCI paradigm where the subject is presented with flickering stimuli. The EEG signal is modulated at the same frequency as the stimulus. Each stimulus flickers at a different frequency.
SSVEP-specific definitions:
* #Classes is the number of different stimulation frequencies.
* A trial is one symbol selection. This includes multiple flashes.
| Dataset | #Subj | #Chan | #Classes | #Trials / class | Trial length | Sampling rate | #Sessions | PapersWithCode leaderboard |
|---|---|---|---|---|---|---|---|---|
|  | 54 | 62 | 4 | 50 | 4s | 1000Hz | 2 |  |
|  | 12 | 8 | 4 | 16 | 2s | 256Hz | 1 |  |
|  | 10 | 256 | 5 | 12-15 | 3s | 250Hz | 1 |  |
|  | 10 | 256 | 5 | 20-30 | 3s | 250Hz | 1 |  |
|  | 10 | 14 | 4 | 20-30 | 3s | 128Hz | 1 |  |
|  | 9 | 8 | 12 | 15 | 4.15s | 256Hz | 1 |  |
|  | 34 | 62 | 40 | 6 | 5s | 250Hz | 1 |  |
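Because the SSVEP response follows the stimulation frequency, the attended stimulus can in principle be identified by comparing the signal power at each candidate frequency. The sketch below illustrates this on a synthetic, noise-free single-channel signal. The 256 Hz sampling rate and 2 s trial length are borrowed from one row of the table; the candidate frequencies, the 3 Hz background component and all names are assumptions made for the example, and real pipelines typically use multi-channel methods instead.

```python
import math


def power_at(signal, freq, sfreq):
    """Squared magnitude of the projection of `signal` on a sinusoid at `freq` Hz."""
    real = sum(x * math.cos(2 * math.pi * freq * n / sfreq) for n, x in enumerate(signal))
    imag = sum(x * math.sin(2 * math.pi * freq * n / sfreq) for n, x in enumerate(signal))
    return real * real + imag * imag


sfreq = 256.0                          # sampling rate in Hz
duration = 2.0                         # trial length in seconds
candidates = [10.0, 12.0, 15.0, 20.0]  # hypothetical stimulation frequencies (4 classes)
true_freq = 12.0                       # frequency of the attended stimulus

n_samples = int(sfreq * duration)
# Synthetic trial: the stimulus-locked response plus a slower background oscillation.
signal = [
    math.sin(2 * math.pi * true_freq * n / sfreq)
    + 0.5 * math.sin(2 * math.pi * 3.0 * n / sfreq)
    for n in range(n_samples)
]

detected = max(candidates, key=lambda f: power_at(signal, f, sfreq))
print(detected)  # 12.0
```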
c-VEP
Includes neuro experiments where the participant is presented with pseudo-random noise-codes, such as m-sequences, Gold codes, or any arbitrary “pseudo-random” code. Specifically, the difference with SSVEP is that SSVEP presents periodic stimuli, while c-VEP presents non-periodic stimuli. For a review of c-VEP BCI, see:
Martínez-Cagigal, V., Thielen, J., Santamaria-Vazquez, E., Pérez-Velasco, S., Desain, P., & Hornero, R. (2021). Brain–computer interfaces based on code-modulated visual evoked potentials (c-VEP): A literature review. Journal of Neural Engineering, 18(6), 061002. DOI: https://doi.org/10.1088/1741-2552/ac38cf
c-VEP-specific definitions:
* A trial is one symbol selection. This includes multiple flashes.
* #Trial classes is the number of different symbols.
* #Epoch classes is the number of possible intensities for the flashes (for a visual c-VEP paradigm). Typically, there are only two intensities: on and off.
* #Epochs / class is the number of flashes per intensity in each session.
* Codes is the type of code used in the experiment.
* Presentation rate is the rate at which the codes are presented.
| Dataset | #Subj | #Sessions | Sampling rate | #Chan | Trial length | #Trial classes | #Trials / class | #Epoch classes | #Epochs / class | Codes | Presentation rate | PapersWithCode leaderboard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 12 | 1 | 2048Hz | 64 | 4.2s | 36 | 3 | 2 | 27216 NT / 27216 T | Gold codes | 120Hz | No |
|  | 30 | 1 | 512Hz | 8 | 31.5s | 20 | 5 | 2 | 18900 NT / 18900 T | Gold codes | 60Hz | No |
|  | 12 | 1 | 500Hz | 32 | 2.2s | 4 | 15/15/15/15 | 2 | 3525 NT / 3495 T | m-sequence | 60Hz | No |
|  | 12 | 1 | 500Hz | 32 | 2.2s | 4 | 15/15/15/15 | 2 | 3525 NT / 3495 T | m-sequence | 60Hz | No |
|  | 12 | 1 | 500Hz | 32 | 2.2s | 4 | 15/15/15/15 | 2 | 5820 NT / 1200 T | Burst-CVEP | 60Hz | No |
|  | 12 | 1 | 500Hz | 32 | 2.2s | 4 | 15/15/15/15 | 2 | 5820 NT / 1200 T | Burst-CVEP | 60Hz | No |
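To make the notion of a pseudo-random code more concrete, the sketch below generates an m-sequence with a linear-feedback shift register and checks its near-ideal circular autocorrelation, the property that makes shifted copies of one code easy to tell apart. The register length of 6 bits and the feedback polynomial x^6 + x + 1 are choices made for this example; they are not taken from any of the datasets listed above.

```python
def m_sequence(n_bits=6, taps=(0, 1)):
    """Generate one period of an m-sequence from a linear-feedback shift register.

    The default taps implement a[n+6] = a[n] XOR a[n+1], i.e. the primitive
    polynomial x^6 + x + 1, which yields a period of 2**6 - 1 = 63 bits.
    """
    state = [1] * n_bits  # any non-zero seed works
    bits = []
    for _ in range(2 ** n_bits - 1):
        bits.append(state[0])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap]
        state = state[1:] + [feedback]
    return bits


def circular_autocorrelation(bits, shift):
    """Circular autocorrelation of the code after mapping bits {0, 1} to {-1, +1}."""
    signs = [1 if b else -1 for b in bits]
    n = len(signs)
    return sum(signs[i] * signs[(i + shift) % n] for i in range(n))


code = m_sequence()
print(len(code), sum(code))                # 63 32
print(circular_autocorrelation(code, 0))   # 63
print(circular_autocorrelation(code, 5))   # -1
```

The autocorrelation equals the code length at zero shift and -1 at every other shift, which is why circularly shifted versions of a single m-sequence can serve as distinct stimulus codes.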
Resting States
Includes neuro experiments where the participant is not actively doing something. For example, recording the EEG of a subject whose eyes are closed or open is a resting state experiment.
| Dataset | #Subj | #Chan | #Classes | #Blocks / class | Trial length | Sampling rate | #Sessions | PapersWithCode leaderboard |
|---|---|---|---|---|---|---|---|---|
|  | 12 | 16 | 2 | 10 | 60s | 512Hz | 1 | No |
|  | 15 | 62 | 4 | 1 | 2s | 250Hz | 1 | No |
|  | 20 | 16 | 2 | 5 | 10s | 512Hz | 1 | No |
Compound Datasets
Compound Datasets are datasets built from subjects of other datasets. They are useful for merging different datasets (including other Compound Datasets) or for selecting a sample of subjects inside a dataset (e.g. subjects with high or low performance).
| Dataset | #Subj | #Original datasets |
|---|---|---|
|  | 17 | BI2014a |
|  | 11 | BI2014b |
|  | 2 | BI2015a |
|  | 25 | BI2015b |
|  | 4 | Cattan2019_VR |
|  | 59 |  |
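The idea of selecting subjects across datasets can be sketched without any library: a compound selection is, at its core, a list of (dataset, subject) pairs. The example below picks the subjects whose score reaches a threshold. The dataset names, scores, threshold and the helper `select_subjects` are all hypothetical, and this is not how MOABB implements its compound datasets.

```python
# Hypothetical per-subject scores, keyed by (dataset name, subject number).
scores = {
    ("DatasetA", 1): 0.91,
    ("DatasetA", 2): 0.62,
    ("DatasetB", 1): 0.85,
    ("DatasetB", 2): 0.55,
}


def select_subjects(scores, threshold):
    """Return the sorted (dataset, subject) pairs whose score is at least `threshold`."""
    return sorted(key for key, score in scores.items() if score >= threshold)


# A compound selection that merges the high-performing subjects of both datasets.
high_performers = select_subjects(scores, 0.8)
print(high_performers)  # [('DatasetA', 1), ('DatasetB', 1)]
```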
Submit a new dataset
You can submit a new dataset by mentioning it in this issue. The datasets currently on our radar can be seen here, but we are open to any suggestion.
If you want to actively contribute to the inclusion of a new dataset, you can also follow this tutorial.