A dataset handles and abstracts low-level access to the data. The dataset takes data stored locally, in the format in which they were downloaded, and converts them into an MNE raw object. There are options to pool all the recording sessions per subject or to evaluate them separately.
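As a rough illustration of what this access looks like, the sketch below walks the nested mapping (subject, then session, then run) that a dataset's `get_data` call returns, and pools the runs of all sessions per subject. The helpers `iter_runs` and `pool_sessions`, the session and run names, and the stand-in strings are assumptions made for this example; they are not part of MOABB, and the pooling shown here is not necessarily how MOABB implements it.

```python
def iter_runs(data):
    """Yield (subject, session, run, raw) from a subject -> session -> run mapping."""
    for subject, sessions in data.items():
        for session, runs in sessions.items():
            for run, raw in runs.items():
                yield subject, session, run, raw


def pool_sessions(data):
    """Group all runs of a subject together, ignoring session boundaries."""
    pooled = {}
    for subject, _session, _run, raw in iter_runs(data):
        pooled.setdefault(subject, []).append(raw)
    return pooled


def load_example():
    """Load one subject with MOABB (needs MOABB installed; downloads data on first use)."""
    from moabb.datasets import BNCI2014_001

    dataset = BNCI2014_001()
    return dataset.get_data(subjects=[1])


# Stand-in for the nested mapping, with strings in place of MNE raw objects.
# The session and run names are hypothetical.
fake = {1: {"0train": {"0": "raw_a", "1": "raw_b"}, "1test": {"0": "raw_c"}}}
print(pool_sessions(fake))  # {1: ['raw_a', 'raw_b', 'raw_c']}
```

Calling `pool_sessions(load_example())` would apply the same grouping to real recordings, at the cost of a download.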

See http://moabb.neurotechx.com/docs/dataset_summary.html for detail on datasets (electrodes, number of trials, sessions, etc.)

Data Summary#

MOABB gathers many datasets; this page summarizes the most important information about them. Most of the datasets are listed here, but the list is not complete yet. Check the API for complete documentation.

Do not hesitate to help us complete this list. It is also possible to add new datasets: there is a tutorial explaining how to do so, and we warmly welcome any new contributions!

It is possible to use an external dataset within MOABB as long as it is in Brain Imaging Data Structure (BIDS) format. See this guide for more information on how to structure your data according to BIDS. You can use this class to convert your local dataset to work within MOABB without creating a new dataset class.

See also the Wiki for supplementary details on datasets (class name, size, licence, etc.).

Dataset, #Subj, #Chan, #Classes, #Trials, Trial length, Freq, #Session, #Runs, Total_trials

Column definitions:

  • Dataset is the name of the dataset.

  • #Subj is the number of subjects.

  • #Chan is the number of EEG channels.

  • #Trials / class is the number of repetitions performed by one subject for each class. This number is computed using only the first subject of each dataset. The definitions of a class and of a trial depend on the paradigm used (see sections below).

  • Trials length is the duration of a trial in seconds.

  • Total_trials is the total number of trials in the dataset (all subjects and classes together).

  • Freq is the sampling frequency of the raw data.

  • #Session is the number of sessions per subject. Different sessions are often recorded on different days.

  • #Runs is the number of runs per session. A run is a continuous recording of the EEG data. Often, the different runs of a given session are recorded without removing the EEG cap in between.
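To make these columns concrete, here is a small worked example. It assumes that Total_trials can be read as the product of #Subj, #Classes, #Trials / class, #Session and #Runs. That product matches the numbers listed for AlexMI and BNCI2014_001 in the Motor Imagery table, but it is an observation about those rows rather than a documented formula, and other rows deviate from it (for instance, Cho2017 lists 9800 where the product gives 10400). The `SummaryRow` class is a hypothetical helper written for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryRow:
    """One row of a summary table (hypothetical helper, not part of MOABB)."""

    name: str
    n_subjects: int
    n_classes: int
    trials_per_class: int
    trial_length_s: float
    sfreq_hz: float
    n_sessions: int
    n_runs: int

    def total_trials(self) -> int:
        # Assumes every subject, session and run holds the same number of trials.
        return (
            self.n_subjects
            * self.n_classes
            * self.trials_per_class
            * self.n_sessions
            * self.n_runs
        )

    def samples_per_trial(self) -> int:
        # Trial duration in seconds times sampling frequency in Hz.
        return round(self.trial_length_s * self.sfreq_hz)


# Values copied from the Motor Imagery table.
alex = SummaryRow("AlexMI", 8, 3, 20, 3.0, 512, 1, 1)
bnci = SummaryRow("BNCI2014_001", 9, 4, 144, 4.0, 250, 2, 6)

print(alex.total_trials())       # 480
print(bnci.total_trials())       # 62208
print(alex.samples_per_trial())  # 1536
```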

Datasets overview:

A visual overview of all datasets can be generated using the functions moabb.datasets.utils.plot_datasets_grid() or moabb.datasets.utils.plot_datasets_cluster(). This overview makes it easy to compare the number of subjects, trials, and sessions across datasets. These functions generate a figure like this:

Visual overview of the datasets used in `The largest EEG-based BCI reproducibility study for open science: the MOABB benchmark <https://universite-paris-saclay.hal.science/hal-04537061v1/file/MOABB-arXiv.pdf>`_

Motor Imagery#

Motor Imagery is a BCI paradigm where the subject imagines performing movements. Each movement is associated with a different command to build an application.

Motor Imagery-specific definitions:

  • #Classes is the number of different imagery tasks.

  • Trial is one repetition of the imagery task.

Dataset, #Subj, #Chan, #Classes, #Trials / class, Trials length (s), Freq (Hz), #Sessions, #Runs, Total_trials
AlexMI, 8, 16, 3, 20, 3.0, 512, 1, 1, 480
BNCI2003_004, 5, 118, 2, 84, 3.5, 100, 1, 1, 1400
BNCI2014_001, 9, 22, 4, 144, 4.0, 250, 2, 6, 62208
BNCI2014_002, 14, 15, 2, 80, 5.0, 512, 1, 8, 17920
BNCI2014_004, 9, 3, 2, 360, 4.5, 250, 5, 1, 32400
BNCI2015_001, 12, 13, 2, 200, 5.0, 512, 3, 1, 14400
BNCI2015_004, 9, 30, 5, 80, 7.0, 256, 2, 1, 7200
Cho2017, 52, 64, 2, 100, 3.0, 512, 1, 1, 9800
Lee2019_MI, 54, 62, 2, 100, 4.0, 1000, 2, 1, 11000
GrosseWentrup2009, 10, 128, 2, 150, 7.0, 500, 1, 1, 3000
Schirrmeister2017, 14, 128, 4, 120, 4.0, 500, 1, 2, 13440
Ofner2017, 15, 61, 7, 60, 3.0, 512, 1, 10, 63000
PhysionetMI, 109, 64, 4, 23, 3.0, 160, 1, 1, 69760
Shin2017A, 29, 30, 2, 30, 10.0, 200, 3, 1, 5220
Shin2017B, 29, 30, 2, 30, 10.0, 200, 3, 1, 5220
Weibo2014, 10, 60, 7, 80, 4.0, 200, 1, 1, 5600
Zhou2016, 4, 14, 3, 160, 5.0, 250, 3, 2, 11496
Stieger2021, 62, 64, 4, 450, 3.0, 1000, 7 or 11, 1, 250000
Liu2024, 50, 29, 2, 20, 4.0, 500, 1, 1, 2000
Beetl2021_A, 4, 63, 4, 224, 4.0, 500, 1, 1, 1490
Beetl2021_B, 2, 32, 4, 160, 4.0, 200, 1, 1, 1590
Dreyer2023A, 60, 27, 2, 20, 5.0, 512, 1, 6, 14400
Dreyer2023B, 21, 27, 2, 20, 5.0, 512, 1, 6, 5040
Dreyer2023C, 6, 27, 2, 20, 5.0, 512, 1, 6, 1440
Dreyer2023, 87, 27, 2, 20, 5.0, 512, 1, 6, 20880
BNCI2020_001, 15, varies (11-64), 3, 80, 5.0, 256, 3, 4, 7200
BNCI2022_001, 13, 67, 4, varies, 90.0, 256, 1, 1, varies
BNCI2024_001, 20, 64, 10, varies, 3.0, 512, 1, 1, varies
BNCI2025_001, 20, 64, 16, varies, 4.0, 500, 1, 1, varies
BNCI2025_002, 20, 64, 3, varies, 8.0, 200, 3, 1, varies
BNCI2019_001, 10, 64, 5, varies, 3.0, 256, 1, 9, varies
Tavakolan2017, 12, 32, 3, 20, 3.0, 1000, 4, 1, 2880
Zhang2017, 12, 17, 10, 30, 4.0, 1000, 1, 15, 4321
Kaya2018, 7, 19, 3, 80, 1.0, 200, 3, 1, 16126
Kumar2024, 18, 22, 2, varies, 5.0, 512, 6, varies, 7156
Rozado2015, 30, 32, 2, 25, 6.0, 512, 1, 2, 1550
Brandl2020, 16, 63, 2, 36, 4.5, 1000, 1, 7, 8064
Ma2020, 25, 62, 2, 300, 4.0, 200, 15, 1, 15000
Wairagkar2018, 14, 19, 3, 40, 6.0, 1024, 1, 1, 1665
Zhou2020, 20, 26, 4, 60, 5.0, 500, 7, 6, 33600
Wu2020, 6, 132, 2, varies, 4.0, 1000, 1, varies, 1114
Jeong2020, 25, 60, 11, 50, 4.0, 2500, 3, 3, 41250
Yang2025, 62, 59, 3, 100, 7.5, 1000, 3, 1, 39600
Forenzo2023, 25, 64, 2, varies, 60.0, 1000, 5, 3, 1875
TrianaGuzman2024, 32, 17, 4, varies, 15.0, 250, 1, 1, 7680
Chang2025, 28, 59, 3, 40, 6.0, 1000, 4, 1, 13440
GuttmannFlury2025_ME, 31, 62, 2, 20, 7.5, 1000, 3, 1, 2520
GuttmannFlury2025_MI, 31, 62, 2, 20, 7.5, 1000, 3, 1, 2520
HefmiIch2025, 37, 32, 2, 15, 27.0, 256, 3, 1, 3330
Yi2025, 18, 62, 8, 40, 4.0, 250, 1, 1, 5760
Gao2026, 22, 32, 10, 40, 4.0, 1000, 2, 3, 16800
Zuo2025, 30, 30, 2, 50, 4.0, 500, 5, 1, 15000
Liu2025, 27, 64, 2, 40, 5.0, 1000, 3, 4, 8640

P300/ERP#

ERP (Event-Related Potential) is a BCI paradigm where the subject is presented with a stimulus and the EEG response is recorded. The P300 is a positive peak in the EEG signal that occurs around 300 ms after the stimulus.

P300-specific definitions:

  • A trial is one flash.

  • The classes are binary: a trial is target if the key on which the subject focuses is flashed and non-target otherwise.
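Because the classes are binary, the "#Trials / class" cells of the P300 table pair a non-target count (NT) with a target count (T), and the two are usually unbalanced. The sketch below parses such a cell and computes the fraction of target trials. The functions `parse_counts` and `target_fraction` are hypothetical helpers written for this page, not MOABB functions; cells without both counts (for example "varies NT / T") simply return `None`.

```python
import re


def parse_counts(cell):
    """Parse a cell such as '3500 NT / 700 T' into {'NT': 3500, 'T': 700}."""
    counts = {}
    # An optional '~' marks approximate counts; the label decides which class it is,
    # so both 'NT / T' and 'T / NT' orderings are handled.
    for number, label in re.findall(r"~?(\d+)\s+(NT|T)\b", cell):
        counts[label] = int(number)
    return counts


def target_fraction(cell):
    """Return T / (T + NT), or None when the cell does not list both counts."""
    counts = parse_counts(cell)
    if set(counts) != {"NT", "T"}:
        return None
    return counts["T"] / (counts["T"] + counts["NT"])


print(parse_counts("3500 NT / 700 T"))     # {'NT': 3500, 'T': 700}
print(parse_counts("168 T / 4032 NT"))     # {'T': 168, 'NT': 4032}
print(target_fraction("varies NT / T"))    # None
print(round(target_fraction("3500 NT / 700 T"), 3))  # 0.167
```

A target fraction well below 0.5, as in this example, is one reason why metrics that are robust to class imbalance are often preferred for this paradigm.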

Dataset, #Subj, #Chan, #Trials / class, Trials length (s), Freq (Hz), #Sessions
BNCI2014_008, 8, 8, 3500 NT / 700 T, 1.0, 256.0, 1
BNCI2014_009, 10, 16, 1440 NT / 288 T, 0.8, 256.0, 3
BNCI2015_003, 10, 8, 1500 NT / 300 T, 0.8, 256.0, 1
BI2012, 25, 16, 640 NT / 128 T, 1.0, 128.0, 2
BI2013a, 24, 16, 3200 NT / 640 T, 1.0, 512.0, 8 for subjects 1-7 else 1
BI2014a, 64, 16, 990 NT / 198 T, 1.0, 512.0, up to 3
BI2014b, 38, 32, 200 NT / 40 T, 1.0, 512.0, 3
BI2015a, 43, 32, 4131 NT / 825 T, 1.0, 512.0, 3
BI2015b, 44, 32, 2160 NT / 480 T, 1.0, 512.0, 1
Cattan2019_VR, 21, 16, 600 NT / 120 T, 1.0, 512.0, 2
Huebner2017, 13, 31, 364 NT / 112 T, 0.9, 1000.0, 3
Huebner2018, 12, 31, 364 NT / 112 T, 0.9, 1000.0, 3
Sosulski2019, 13, 31, 7500 NT / 1500 T, 1.2, 1000.0, 1
EPFLP300, 8, 32, 2753 NT / 551 T, 1.0, 2048.0, 4
Lee2019_ERP, 54, 62, 6900 NT / 1380 T, 1.0, 1000.0, 2
ErpCore2021, 40, 30, varies by ERP task, 1.0, 1024.0, 1
ErpCore2021_N170, 40, 30, 240 NT / 80 T, 1.0, 1024.0, 1
ErpCore2021_MMN, 40, 30, 800 NT / 200 T, 1.0, 1024.0, 1
ErpCore2021_N2pc, 40, 30, 160 NT / 160 T, 1.0, 1024.0, 1
ErpCore2021_P3, 40, 30, 160 NT / 40 T, 1.0, 1024.0, 1
ErpCore2021_N400, 40, 30, 60 NT / 60 T, 1.0, 1024.0, 1
ErpCore2021_ERN, 40, 30, ~400 All, 1.0, 1024.0, 1
ErpCore2021_LRP, 40, 30, ~400 All, 1.0, 1024.0, 1
Kojima2024A, 11, 64, ~130 NT / ~65 T, 1.0, 1000.0, 1
Kojima2024B, 15, 64, 2160 NT / 720 T, 1.0, 1000.0, 1
RomaniBF2025ERP, 22, 8, 540 NT / 60 T, 1.0, 250.0, up to 3
BNCI2015_009, 21, 62, 10071 NT / 2014 T, 0.8, 1000.0, 2
BNCI2015_010, 12, 63, 18850 NT / 650 T, 0.8, 1000.0, 1
BNCI2015_012, 10, 63, 6075 NT / 759 T, 1.0, 1000.0, 1
BNCI2015_013, 6, 64, 809 NT / 235 T, 0.6, 512.0, 2
BNCI2016_002, 15, 69, varies brake / EMG, 1.5, 200.0, 1
BNCI2020_002, 18, 31, varies NT / T, 16.0, 250.0, 1
BNCI2015_006, 11, 64, ~875 NT / ~856 T, 1.0, 200.0, 1
BNCI2015_007, 16, 63, varies NT / T, 0.7, 100.0, 1
BNCI2015_008, 13, 63, varies NT / T, 1.0, 250.0, 2
Lee2021Mobile_ERP, 24, 32, 250 NT / 50 T, 1.0, 500.0, 5 for subjects 1-18 else 4
GuttmannFlury2025_P300, 31, 66, varies NT / T, 1.0, 1000.0, 1-3
Mainsah2025_A, 13, 32, varies NT / T, 1.0, 256.0, 1
Mainsah2025_B, 19, 16, varies NT / T, 1.0, 256.0, up to 8
Mainsah2025_C, 19, 32, varies NT / T, 1.0, 256.0, 1
Mainsah2025_D, 17, 32, varies NT / T, 1.0, 256.0, 1
Mainsah2025_E, 8, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_F, 10, 16, varies NT / T, 1.0, 256.0, 3
Mainsah2025_G, 20, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_H, 16, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_I, 13, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_J, 20, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_K, 5, 16, varies NT / T, 1.0, 256.0, up to 2
Mainsah2025_L, 11, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_M, 21, 16, varies NT / T, 1.0, 256.0, 1
Mainsah2025_N, 8, 16, varies NT / T, 1.0, 256.0, 2
Mainsah2025_O, 18, 32, varies NT / T, 1.0, 256.0, 2
Mainsah2025_P, 19, 32, varies NT / T, 1.0, 256.0, 2
Mainsah2025_Q, 36, 32, varies NT / T, 1.0, 256.0, 1
Mainsah2025_R, 20, 32, varies NT / T, 1.0, 256.0, 2
Mainsah2025_S1, 10, 32, varies NT / T, 1.0, 256.0, 1
Mainsah2025_S2, 24, 32, varies NT / T, 1.0, 256.0, 1
Chailloux2020, 19, 8, varies NT / T, 1.0, 256.0, 3
Lee2024_TV, 30, 31, varies NT / T, 1.0, 500.0, 1
Lee2024_DL, 15, 31, varies NT / T, 1.0, 500.0, 1
Lee2024_EL, 15, 31, varies NT / T, 1.0, 500.0, 1
Lee2024_BS, 14, 31, varies NT / T, 1.0, 500.0, 1
Lee2024_AC, 10, 25, varies NT / T, 1.0, 500.0, 1
Zheng2020, 14, 62, 168 T / 4032 NT, 1.0, 1000.0, 2
Zhang2025, 15, 57, varies T / NT, 1.0, 1000.0, 4
Kaneshiro2015, 10, 124, 5184 (6 classes), 0.496, 62.5, 1
Simoes2020, 15, 8, varies NT / T, 1.0, 250.0, 7
Speier2017, 10, 32, ~1200 per run, 1.0, 256.0, 2

SSVEP#

SSVEP (Steady-State Visually Evoked Potential) is a BCI paradigm where the subject is presented with flickering stimuli, each flickering at a different frequency. The EEG signal is modulated at the same frequency as the stimulus the subject attends to.

SSVEP-specific definitions:

  • #Classes is the number of different stimulation frequencies.

  • A trial is one symbol selection. This includes multiple flashes.
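The idea that the EEG follows the stimulation frequency can be illustrated with a toy simulation. The sketch below generates a noiseless sinusoid as a stand-in for the response to one stimulus, then picks the candidate frequency with the largest spectral power. Everything here is an assumption made for illustration: real EEG is noisy and multi-channel, the candidate frequencies are hypothetical, and this is not the classification method used by MOABB pipelines.

```python
import cmath
import math


def simulate_response(freq_hz, sfreq_hz, duration_s):
    """Noiseless sinusoid standing in for the EEG response to one stimulus."""
    n_samples = int(sfreq_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sfreq_hz) for i in range(n_samples)]


def power_at(signal, freq_hz, sfreq_hz):
    """Squared magnitude of the discrete Fourier sum at a single frequency."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sfreq_hz)
        for i, x in enumerate(signal)
    )
    return abs(acc) ** 2


def detect_frequency(signal, candidates_hz, sfreq_hz):
    """Return the candidate stimulation frequency with the highest power."""
    return max(candidates_hz, key=lambda f: power_at(signal, f, sfreq_hz))


# A 2-second trial sampled at 256 Hz, with three hypothetical stimulus frequencies.
signal = simulate_response(17.0, 256, 2.0)
print(detect_frequency(signal, [13.0, 17.0, 21.0], 256))  # 17.0
```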

Dataset, #Subj, #Chan, #Classes, #Trials / class, Trials length (s), Freq (Hz), #Sessions
Lee2019_SSVEP, 54, 62, 4, 50, 4.0, 1000, 2
Kalunga2016, 12, 8, 4, 16, 2.0, 256, 1
MAMEM1, 10, 256, 5, 12-15, 3.0, 250, 1
MAMEM2, 10, 256, 5, 20-30, 3.0, 250, 1
MAMEM3, 10, 14, 4, 20-30, 3.0, 128, 1
Nakanishi2015, 9, 8, 12, 15, 4.15, 256, 1
Wang2016, 34, 64, 40, 6, 5.0, 250, 1
Liu2020BETA, 70, 64, 40, 4, 3.0, 250, 1
Liu2022EldBETA, 100, 64, 9, 7, 6.0, 1000, 7
Kim2025BetaRange, 40, 33, 40, 6, 5.0, 1024, 6
Dong2023, 59, 8, 40, 4, 4.0, 250, 1
Lee2021Mobile_SSVEP, 24, 32, 3, 20, 5.0, 500, 4-5
Chen2017SingleFlicker, 12, 32, 4, varies, 3.5, 512/2048, 2
Wang2021Combined, 8, 32, 4, varies, 5.0, 1000, 1
Han2024Fatigue, 24, 64, 32, 6-24, 2.0, 1000, 2
GuttmannFlury2025_SSVEP, 31, 66, 4, 12, 5.0, 1000, 1-3

c-VEP#

Includes neuro experiments where the participant is presented with pseudo-random noise codes, such as m-sequences, Gold codes, or any arbitrary "pseudo-random" code. The difference from SSVEP is that SSVEP presents periodic stimuli, while c-VEP presents non-periodic stimuli. For a review of c-VEP BCI, see:

Martínez-Cagigal, V., Thielen, J., Santamaria-Vazquez, E., Pérez-Velasco, S., Desain, P., & Hornero, R. (2021). Brain–computer interfaces based on code-modulated visual evoked potentials (c-VEP): A literature review. Journal of Neural Engineering, 18(6), 061002. DOI: https://doi.org/10.1088/1741-2552/ac38cf

c-VEP-specific definitions:

  • A trial is one symbol selection. This includes multiple flashes.

  • #Trial classes is the number of different symbols.

  • #Epoch classes is the number of possible intensities for the flashes (for a visual cVEP paradigm). Typically, there are only two intensities: on and off.

  • #Epochs / class is the number of flashes per intensity in each session.

  • Codes is the type of code used in the experiment.

  • Presentation rate is the rate at which the codes are presented.
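For readers unfamiliar with m-sequences, the sketch below generates a binary one with a 6-bit linear-feedback shift register and checks two of its well-known properties: it is nearly balanced, and its circular autocorrelation is flat away from zero shift. The recurrence and seed are illustrative choices (based on the primitive polynomial x^6 + x + 1); they are an assumption of this example and are not claimed to be the codes used in any of the datasets listed on this page.

```python
def m_sequence(n_bits=63):
    """Binary m-sequence from the recurrence a[n+6] = a[n+1] XOR a[n]."""
    bits = [1, 0, 0, 0, 0, 0]  # any non-zero 6-bit seed gives a shifted copy
    while len(bits) < n_bits:
        n = len(bits) - 6
        bits.append(bits[n + 1] ^ bits[n])
    return bits[:n_bits]


def circular_autocorrelation(bits, shift):
    """Autocorrelation of the code mapped to +1/-1, with wrap-around."""
    signs = [1 - 2 * b for b in bits]
    n = len(signs)
    return sum(signs[i] * signs[(i + shift) % n] for i in range(n))


code = m_sequence()  # one full period of 2**6 - 1 = 63 bits
print(sum(code))     # 32 ones (and 31 zeros)
print({circular_autocorrelation(code, k) for k in range(1, 63)})  # {-1}
```

The flat autocorrelation is what makes shifted copies of one code easy to tell apart, which is why such codes are attractive for assigning one code (or one shift) per symbol.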

Dataset, #Subj, #Sessions, Freq (Hz), #Chan, Trials length (s), #Trial classes, #Trials / class, #Epochs classes, #Epochs / class, Codes, Presentation rate (Hz)
MartinezCagigal2023Pary, 16, 5, 256, 16, 5.3/6.7/10.3/4.0/10.0, 16, 2-30, 2-11, 6200-19220, p-ary m-sequence, 120
MartinezCagigal2023Checker, 16, 8, 256, 16, 4.2, 16, 2-30, 2, 11904/12288, m-sequence, 120
Thielen2015, 12, 1, 2048, 64, 4.2, 36, 3, 2, 27216 NT / 27216 T, Gold codes, 120
Thielen2021, 30, 1, 512, 8, 31.5, 20, 5, 2, 18900 NT / 18900 T, Gold codes, 60
CastillosCVEP100, 12, 1, 500, 32, 2.2, 4, 15/15/15/15, 2, 3525 NT / 3495 T, m-sequence, 60
CastillosCVEP40, 12, 1, 500, 32, 2.2, 4, 15/15/15/15, 2, 3525 NT / 3495 T, m-sequence, 60
CastillosBurstVEP40, 12, 1, 500, 32, 2.2, 4, 15/15/15/15, 2, 5820 NT / 1200 T, Burst-CVEP, 60
CastillosBurstVEP100, 12, 1, 500, 32, 2.2, 4, 15/15/15/15, 2, 5820 NT / 1200 T, Burst-CVEP, 60

Resting States#

Includes neuro experiments where the participant is not actively performing a task. For example, recording the EEG of a subject while their eyes are closed or open is a resting state experiment.

Dataset, #Subj, #Chan, #Classes, #Blocks / class, Trials length (s), Freq (Hz), #Sessions
Cattan2019_PHMD, 12, 16, 2, 5, 60, 512, 1
Hinss2021, 15, 62, 4, 1, 2, 250, 1
Rodrigues2017, 20, 16, 2, 5, 10, 512, 1

Compound Datasets#

Compound Datasets are datasets built from subjects of other datasets. They are useful for merging different datasets (including other Compound Datasets) or for selecting a sample of subjects within a dataset (e.g. subjects with high or low performance).

Dataset, #Subj, #Original datasets
BI2014a_Il, 17, BI2014a
BI2014b_Il, 11, BI2014b
BI2015a_Il, 2, BI2015a
BI2015b_Il, 25, BI2015b
Cattan2019_VR_Il, 4, Cattan2019_VR
BI_Il, 59, BI2014a_Il BI2014b_Il BI2015a_Il BI2015b_Il Cattan2019_VR_Il
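One way to picture a compound dataset is as a list of (dataset, subject) pairs, so that merging compound datasets is just concatenating such lists. The sketch below uses that representation to check that the 59 subjects of BI_Il are the sum of its five parts. The helpers `make_subset` and `merge` and the placeholder subject numbers are assumptions made for this example; the subjects actually selected in each dataset are not listed on this page, and this is not MOABB's own implementation.

```python
def make_subset(dataset_name, subject_ids):
    """Represent a selection of subjects as (dataset, subject) pairs."""
    return [(dataset_name, subject) for subject in subject_ids]


def merge(*subsets):
    """Concatenate several selections into one compound selection."""
    merged = []
    for subset in subsets:
        merged.extend(subset)
    return merged


# Subject counts from the table above; subject numbers 1..n are placeholders.
sizes = {
    "BI2014a": 17,
    "BI2014b": 11,
    "BI2015a": 2,
    "BI2015b": 25,
    "Cattan2019_VR": 4,
}
subsets = [make_subset(name, range(1, n + 1)) for name, n in sizes.items()]
bi_il = merge(*subsets)
print(len(bi_il))  # 59
```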

Submit a new dataset#

You can submit a new dataset by mentioning it in this issue. The datasets currently on our radar can be seen here, but we are open to any suggestion.

If you want to actively contribute to the inclusion of a new dataset, you can also follow this tutorial.