Convert a MOABB dataset to BIDS

The Brain Imaging Data Structure (BIDS) format is a standard for storing neuroimaging data. It follows fixed principles to facilitate the sharing of neuroimaging data between researchers.

The MOABB library can convert any MOABB dataset to BIDS [1] and [2].

In this example, we will convert the AlexMI dataset to BIDS by passing cache_config=dict(path=temp_dir, save_raw=True) to the get_data method of the dataset object.

This automatically saves the raw data in the BIDS format and lets the cache be reused the next time the dataset is loaded.

We will use the AlexMI dataset [3], which has one of the smallest numbers of subjects and can be downloaded quickly.

# Authors: Pierre Guetschel
# License: BSD (3-clause)

import shutil
import tempfile
from pathlib import Path

import mne

from moabb import set_log_level
from moabb.datasets import AlexMI


Basic usage

Here, we will save the BIDS version of the dataset in a temporary folder:

temp_dir = Path(tempfile.mkdtemp())
# The conversion of any MOABB dataset to a BIDS-compliant structure can be done
# by simply calling its ``get_data`` method and using the ``cache_config``
# parameter. This parameter is a dictionary.
dataset = AlexMI()
# Reducing the number of subjects to speed up the example
dataset.subject_list = dataset.subject_list[:1]
_ = dataset.get_data(cache_config=dict(path=temp_dir, save_raw=True))

Before / after folder structure

To investigate what was saved, we will first define a function to print the folder structure of a given path:

def print_tree(p: Path, last=True, header=""):
    elbow = "└──"
    pipe = "│  "
    tee = "├──"
    blank = "   "
    print(header + (elbow if last else tee) + p.name)
    if p.is_dir():
        children = list(p.iterdir())
        for i, c in enumerate(children):
            print_tree(
                c, header=header + (blank if last else pipe), last=i == len(children) - 1
            )
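As a quick sanity check, the same recursion can be tried on a small throwaway directory. The sketch below is a variant rather than the function above verbatim: the hypothetical helper `tree_lines` returns the lines instead of printing them, and it sorts the entries so the result does not depend on filesystem order. The folder and file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def tree_lines(p: Path, last=True, header=""):
    """Return the tree of ``p`` as a list of lines (hypothetical helper)."""
    elbow, pipe, tee, blank = "└──", "│  ", "├──", "   "
    lines = [header + (elbow if last else tee) + p.name]
    if p.is_dir():
        # Sort entries so the result does not depend on filesystem order.
        children = sorted(p.iterdir())
        for i, c in enumerate(children):
            lines += tree_lines(
                c,
                header=header + (blank if last else pipe),
                last=i == len(children) - 1,
            )
    return lines


# Build a tiny, made-up folder layout and render it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "demo"
    (root / "sub-1" / "ses-0").mkdir(parents=True)
    (root / "sub-1" / "ses-0" / "recording.edf").touch()
    (root / "README").touch()
    demo_lines = tree_lines(root)

print("\n".join(demo_lines))
```

Returning a list instead of printing makes the helper easier to check in a test, at the cost of holding the whole listing in memory, which is negligible for a single dataset folder.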

Now, we will retrieve the location of the original dataset. It is stored in the MNE data directory, which can be found with the "MNE_DATA" key:

mne_data = Path(mne.get_config("MNE_DATA"))
print(f"MNE data directory: {mne_data}")
MNE data directory: /home/runner/mne_data

Now, we can print the folder structure of the original dataset:

print("Before conversion:")
print_tree(mne_data / "MNE-alexeeg-data")
Before conversion:

As we can see, before conversion, all the data (i.e. from all subjects, sessions and runs) is stored in a single folder. This follows no particular standard and can vary from one dataset to another.

After conversion, the data is stored in a BIDS-compliant way:

print("After conversion:")
print_tree(temp_dir / "MNE-BIDS-alexandre-motor-imagery")
After conversion:
   │  ├──sub-1_desc-2f4a2e6207d30d406bb04b5b1aae5195_lockfile.json
   │  └──ses-0
   │     ├──eeg
   │     │  ├──sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_events.json
   │     │  ├──sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_channels.tsv
   │     │  ├──sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_events.tsv
   │     │  ├──sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_eeg.json
   │     │  └──sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_eeg.edf
   │     └──sub-1_ses-0_scans.tsv

In the BIDS version of our dataset, the raw files are saved in the EDF format. The data is organized in a hierarchy of folders, starting with the subjects, then the sessions, and then the runs. Metadata files are stored alongside the recordings to describe the data. For more details on the BIDS structure, please refer to the BIDS website and the BIDS specification.
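The file names in the listing follow the BIDS pattern of underscore-separated `key-value` entities, followed by a suffix and an extension. To make that pattern concrete, here is a minimal sketch that splits one of the names above into its parts. The helper `parse_bids_name` is hypothetical and written only for this illustration. It is not part of MOABB and does not validate names against the specification.

```python
def parse_bids_name(filename: str) -> dict:
    """Split a BIDS-style file name into its entities (hypothetical helper)."""
    # Separate the extension from the rest of the name.
    stem, _, extension = filename.partition(".")
    # Everything but the last underscore-separated chunk is a key-value pair.
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    entities["extension"] = "." + extension if extension else ""
    return entities


name = "sub-1_ses-0_task-imagery_run-0_desc-2f4a2e6207d30d406bb04b5b1aae5195_eeg.edf"
info = parse_bids_name(name)
print(info)
```

Here `info["sub"]`, `info["ses"]` and `info["run"]` give the subject, session and run that also appear in the folder hierarchy, while `info["desc"]` holds the long hash visible in the listing.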

Under the hood, saving datasets to BIDS is done through the caching system of MOABB. Only raw EEG files are officially supported by the BIDS specification. However, MOABB's caching mechanism can also save the data in a pseudo-BIDS format after different preprocessing steps. In particular, we can save mne.Epochs and np.ndarray objects. For more details on the caching system, please refer to the tutorial "Cache on disk intermediate data processing states".


Finally, we can delete the temporary folder:

shutil.rmtree(temp_dir)
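If the temporary folder is only needed for the duration of a block of code, the standard library's `tempfile.TemporaryDirectory` context manager removes it automatically, which avoids a forgotten cleanup step. The following is a minimal sketch, independent of MOABB. The name `tmp_path` and the placeholder file are assumptions made for illustration and stand in for the cached dataset.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    # Stand-in for the conversion step (e.g. a call to dataset.get_data).
    (tmp_path / "placeholder.txt").write_text("cached data would go here")
    existed_inside = tmp_path.exists()

# On leaving the ``with`` block, the folder and its contents are deleted.
existed_after = tmp_path.exists()
print(existed_inside, existed_after)
```

This pattern suits short scripts. When the cache should persist between runs, a fixed path is the better choice, since the point of the cache is to be found again.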


