moabb.datasets.Stieger2021#

class moabb.datasets.Stieger2021(interval=[0, 3], sessions=None, fix_bads=True, subjects=None, **kwargs)[source]#

Bases: BaseDataset

Dataset Snapshot

Stieger2021

Continuous sensorimotor rhythm based brain computer interface learning in a large population

Motor Imagery, 4 classes (right_hand vs left_hand vs both_hand vs rest)

Authors: James R. Stieger, Stephen A. Engel, Bin He

🇺🇸 Carnegie Mellon University, University of Minnesota, US · 2021 · bhe1@andrew.cmu.edu
Motor Imagery · Code: Stieger2021 · 62 subjects · 11 sessions · 62 ch · 1000 Hz · 4 classes · 3.0 s trials · CC BY-NC 4.0

Class Labels: right_hand, left_hand, both_hand, rest

Overview

Motor Imagery dataset from Stieger et al. 2021

The main goals of our original study were to characterize how individuals learn to control SMR-BCIs and to test whether this learning can be improved through behavioral interventions such as mindfulness training. Participants were initially assessed for baseline BCI proficiency and then randomly assigned to either an 8-week mindfulness intervention (mindfulness-based stress reduction, MBSR) or a waitlist control condition. Participants in the control condition waited for the same duration as the MBSR class before starting BCI training and were offered a comparable MBSR course after completing all experimental requirements. Following the 8 weeks, participants returned to the lab for 6 to 10 sessions of BCI training.

All experiments were approved by the institutional review boards of the University of Minnesota and Carnegie Mellon University. Informed consent was obtained from all subjects. In total, 144 participants were enrolled in the study and 76 participants completed all experimental requirements. Seventy-two participants were assigned to each intervention by block randomization, with 42 participants completing all sessions in the experimental group (MBSR before BCI training; MBSR subjects) and 34 completing experimentation in the control group. Four subjects were excluded from the analysis due to non-compliance with the task demands and one was excluded due to experimenter error. We were primarily interested in how individuals learn to control BCIs; therefore, the analysis focused on those who did not demonstrate ceiling performance in the baseline BCI assessment (accuracy above 90% in 1D control). The dataset descriptor presented here describes data collected from 62 participants: 33 MBSR participants (Age=42±15, Female=26) and 29 controls (Age=36±13, Female=23). In the United States, women are twice as likely as men to practice meditation. Therefore, the gender imbalance in our study may result from a greater likelihood of women responding to flyers offering a meditation class in exchange for participating in our study.

For all BCI sessions, participants were seated comfortably in a chair and faced a computer monitor that was placed approximately 65 cm in front of them. After the EEG capping procedure (see data acquisition), the BCI tasks began. Before each task, participants received the appropriate instructions. During the BCI tasks, users attempted to steer a virtual cursor from the center of the screen out to one of four targets. Participants initially received the following instructions: “Imagine your left (right) hand opening and closing to move the cursor left (right). Imagine both hands opening and closing to move the cursor up. Finally, to move the cursor down, voluntarily rest; in other words, clear your mind.” In separate blocks of trials, participants directed the cursor toward a target that required left/right (LR) movement only, up/down (UD) movement only, or combined 2D movement (2D). Each experimental block (LR, UD, 2D) consisted of 3 runs, where each run was composed of 25 trials. After the first three blocks, participants were given a short break (5-10 minutes) that required rating comics by preference. The break task was chosen to standardize subject experience over the break interval. Following the break, participants completed the same 3 blocks as before. In total, each session consisted of 2 blocks of each task (6 runs each of LR, UD, and 2D control), which culminated in 450 trials performed each day.
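The session arithmetic described above can be checked with a short calculation. This is only a worked example based on the counts stated in the text; the variable names are illustrative and are not part of MOABB.

```python
# Trial bookkeeping for one BCI session, using the counts stated in the text.
tasks = ["LR", "UD", "2D"]   # one block type per task
blocks_per_task = 2          # one block before the break, one after
runs_per_block = 3
trials_per_run = 25

runs_per_task = blocks_per_task * runs_per_block
trials_per_task = runs_per_task * trials_per_run
trials_per_session = len(tasks) * trials_per_task

print(runs_per_task, trials_per_task, trials_per_session)
```

Each task therefore contributes 6 runs, and the three tasks together give the 450 trials per session quoted above.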

Online BCI control of the cursor proceeded in a series of steps. The first step, feature extraction, consisted of spatial filtering and spectrum estimation. During spatial filtering, the average signal of the 4 electrodes surrounding the hand knob of the motor cortex was subtracted from electrodes C3 and C4 to reduce the spatial noise. Following spatial filtering, the power spectrum was estimated by fitting an autoregressive model of order 16 to the most recent 160 ms of data using the maximum entropy method. The goal of this method is to find the coefficients of a linear all-pole filter that, when applied to white noise, reproduces the data's spectrum. The main advantage of this method is that it produces high frequency resolution estimates for short segments of data. The parameters are found by minimizing (through least squares) the forward and backward prediction errors on the input data subject to the constraint that the filter used for estimation shares the same autocorrelation sequence as the input data. Thus, the estimated power spectrum directly corresponds to this filter's transfer function divided by the signal's total power. Numerical integration was then used to find the power within a 3 Hz bin centered within the alpha rhythm (12 Hz). The translation algorithm, the next step in the pipeline, then translated the user's alpha power into cursor movement. Horizontal motion was controlled by lateralized alpha power (C4 - C3) and vertical motion was controlled by up and down regulating total alpha power (C4 + C3). These control signals were normalized to zero mean and unit variance across time by subtracting the signals' mean and dividing by its standard deviation. 
A balanced estimate of the mean and standard deviation of the horizontal and vertical control signals was calculated by estimating these values across time from data derived from 30 s buffers of individual trial type (e.g., the normalized control signal should be positive for right trials and negative for left trials, but the average of left and right trials should be zero). Finally, the normalized control signals were used to update the position of the cursor every 40 ms.
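The spatial filtering and translation steps described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' online implementation: the function names are hypothetical, the autoregressive spectrum estimation is left out (alpha power is taken as given), and "balanced" is read here as pooling equally sized buffers from each trial type.

```python
import numpy as np


def small_laplacian(center, neighbours):
    """Subtract the mean of the surrounding electrodes from the centre one.

    center: shape (n_samples,); neighbours: shape (n_neighbours, n_samples).
    """
    center = np.asarray(center, dtype=float)
    neighbours = np.asarray(neighbours, dtype=float)
    return center - neighbours.mean(axis=0)


def control_signals(alpha_c3, alpha_c4):
    """Raw control signals from alpha-band power at C3 and C4."""
    alpha_c3 = np.asarray(alpha_c3, dtype=float)
    alpha_c4 = np.asarray(alpha_c4, dtype=float)
    horizontal = alpha_c4 - alpha_c3  # lateralized alpha power
    vertical = alpha_c4 + alpha_c3    # total alpha power
    return horizontal, vertical


def balanced_stats(buffer_a, buffer_b):
    """Mean and std with equal weight per trial type (assumed reading of 'balanced')."""
    n = min(len(buffer_a), len(buffer_b))
    pooled = np.concatenate([
        np.asarray(buffer_a[:n], dtype=float),
        np.asarray(buffer_b[:n], dtype=float),
    ])
    return pooled.mean(), pooled.std()


def normalize(signal, mean, std):
    """Z-score a control signal with precomputed statistics."""
    return (np.asarray(signal, dtype=float) - mean) / std


# Toy horizontal control values buffered from left and right trials.
left_buffer = np.array([-1.5, -1.0, -0.5])
right_buffer = np.array([1.0, 2.0, 3.0])
mean, std = balanced_stats(left_buffer, right_buffer)
left_norm = normalize(left_buffer, mean, std)
right_norm = normalize(right_buffer, mean, std)
print(left_norm.mean(), right_norm.mean())
```

With equally sized buffers, the normalized left and right means are opposite in sign and average to zero, which is the property the text describes for the balanced estimate.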

Stimulus Protocol
Figure: Stieger2021 stimulus protocol diagram (Stieger2021.svg)

3 s task window per trial · 4-class motor imagery paradigm · 1 run/session across 11 sessions

HED Event Tags
HED tags: 4/4 events annotated

Source: MOABB BIDS HED annotation mapping.

Tag counts:

  • Sensory-event: 4

  • Agent-action: 3

  • Experimental-stimulus: 1

  • Rest: 1

  • Visual-presentation: 1

Tags per event:

  • right_hand: Sensory-event, Agent-action

  • left_hand: Sensory-event, Agent-action

  • both_hand: Sensory-event, Agent-action

  • rest: Sensory-event, Experimental-stimulus, Visual-presentation, Rest
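The tag counts listed above follow directly from the per-event top-level tags. The short sketch below recomputes them; the dictionary is transcribed from this page and the variable names are illustrative, not a MOABB API.

```python
from collections import Counter

# Top-level HED tags per event, as listed on this page.
event_tags = {
    "right_hand": ["Sensory-event", "Agent-action"],
    "left_hand": ["Sensory-event", "Agent-action"],
    "both_hand": ["Sensory-event", "Agent-action"],
    "rest": ["Sensory-event", "Experimental-stimulus", "Visual-presentation", "Rest"],
}

# Count in how many events each tag appears.
tag_counts = Counter(tag for tags in event_tags.values() for tag in tags)
print(tag_counts)
```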

HED tree view

Tree · right_hand
├─ Sensory-event
│  ├─ Experimental-stimulus
│  └─ Visual-presentation
└─ Agent-action
   └─ Imagine
      ├─ Move
      └─ Right
         └─ Hand

Tree · left_hand
├─ Sensory-event
│  ├─ Experimental-stimulus
│  └─ Visual-presentation
└─ Agent-action
   └─ Imagine
      ├─ Move
      └─ Left
         └─ Hand

Tree · both_hand
├─ Sensory-event
│  ├─ Experimental-stimulus
│  └─ Visual-presentation
└─ Agent-action
   └─ Imagine
      ├─ Move
      └─ Hand

Tree · rest
├─ Sensory-event
├─ Experimental-stimulus
├─ Visual-presentation
└─ Rest

Channel Summary
  • Total channels: 62

  • EEG: 62 (EEG)

  • Montage: 10-10

  • Sampling: 1000 Hz

  • Filter: 0.1 to 200 Hz with 60 Hz notch filter

  • Notch / line: 60 Hz

This diagram is automatically generated from MOABB metadata. Please consult the original publication to confirm the experimental protocol details.

Motor Imagery dataset from Stieger et al. 2021 [1].

References

[1] Stieger, J. R., Engel, S. A., & He, B. (2021). Continuous sensorimotor rhythm based brain computer interface learning in a large population. Scientific Data, 8(1), 98. https://doi.org/10.1038/s41597-021-00883-1

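from moabb.datasets import Stieger2021

dataset = Stieger2021()
# The data for subject 1 is downloaded automatically if no local copy is found.
data = dataset.get_data(subjects=[1])
# The result is keyed by subject; data[1] holds that subject's sessions.
print(data[1])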

Dataset summary

  • #Subj: 62

  • #Chan: 64

  • #Classes: 4

  • #Trials / class: 450

  • Trials length: 3 s

  • Freq: 1000 Hz

  • #Sessions: 7 or 11

  • #Runs: 1

  • Total_trials: 250000

Participants

  • Population: healthy

  • Handedness: mostly right-handed

Equipment

  • Amplifier: Neuroscan SynAmps RT amplifiers

  • Electrodes: EEG

  • Montage: 10-10

Preprocessing

  • Data state: raw

Experimental Protocol

  • Paradigm: imagery

  • Tasks: LR, UD, 2D

  • Feedback: visual

  • Stimulus: target_bar

Notes

Added in version 1.1.0.

__init__(interval=[0, 3], sessions=None, fix_bads=True, subjects=None, **kwargs)[source]#

Initialize Stieger2021 dataset.

Parameters:
  • interval (list of float, default=[0, 3]) – Epoch interval [tmin, tmax] in seconds relative to stimulus onset. Because trials in this dataset have variable lengths (roughly 0.04 s to 6 s), epochs whose trial is shorter than tmax are automatically rejected. Use get_trial_info() to inspect per-subject trial-length distributions and suggest_interval() to pick an interval that retains a desired fraction of trials.

  • sessions (list of int or None) – Sessions to load.

  • fix_bads (bool) – If True, bad channels are interpolated.

  • subjects (list of int or None) – Subjects to load.

property all_subjects#

Full list of subjects available in this dataset (unfiltered).

convert_to_bids(path=None, subjects=None, overwrite=False, format='EDF', verbose=None)[source]#

Convert the dataset to BIDS format.

Saves the raw EEG data in a BIDS-compliant directory structure. Unlike the caching mechanism (see CacheConfig), the files produced here do not contain a processing-pipeline hash (desc-<hash>) in their names, making the output a clean, shareable BIDS dataset.

Parameters:
  • path (str | Path | None) – Directory under which the BIDS dataset will be written. If None the default MNE data directory is used (same default as the rest of MOABB).

  • subjects (list of int | None) – Subject numbers to convert. If None, all subjects in subject_list are converted.

  • overwrite (bool) – If True, existing BIDS files for a subject are removed before saving. Default is False.

  • format (str) – The file format for the raw EEG data. Supported values are "EDF" (default), "BrainVision", and "EEGLAB".

  • verbose (str | None) – Verbosity level forwarded to MNE/MNE-BIDS.

Returns:

bids_root – Path to the root of the written BIDS dataset.

Return type:

pathlib.Path

Examples

>>> from moabb.datasets import AlexMI
>>> dataset = AlexMI()
>>> bids_root = dataset.convert_to_bids(path='/tmp/bids', subjects=[1])

See also

CacheConfig

Cache configuration for get_data().

moabb.datasets.bids_interface.get_bids_root

Return the BIDS root path.

Notes

Added in version 1.5.

data_path(subject, path=None, force_update=False, update_path=None, verbose=None)[source]#

Get the path to the local copy of a subject's data.

Parameters:
  • subject (int) – Subject number to use.

  • path (None | str) – Location in which to look for the stored data. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.

  • force_update (bool) – Force update of the dataset even if a local copy exists.

  • update_path (bool | None Deprecated) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.

  • verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).

Returns:

path – Local path to the given data file. This path is contained inside a list of length one, for compatibility.

Return type:

list of str
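The lookup order described for the path parameter can be sketched as follows. This is a hypothetical helper written for illustration, not MOABB or MNE code: resolve_data_dir and the "EXAMPLE" dataset key are assumptions, and only the environment variable is checked (the MNE config file mentioned above is not consulted).

```python
import os
from pathlib import Path


def resolve_data_dir(path=None, dataset_key="EXAMPLE", env=None):
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env
    # 1. An explicit path always wins.
    if path is not None:
        return Path(path).expanduser()
    # 2. Otherwise fall back to MNE_DATASETS_(dataset)_PATH if it is set.
    configured = env.get(f"MNE_DATASETS_{dataset_key}_PATH")
    if configured:
        return Path(configured).expanduser()
    # 3. Finally use the default ~/mne_data directory.
    return Path.home() / "mne_data"


print(resolve_data_dir(env={}))
```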

download(subject_list=None, path=None, force_update=False, update_path=None, accept=False, verbose=None)[source]#

Download all data from the dataset.

This function is only useful for downloading the whole dataset at once.

Parameters:
  • subject_list (list of int | None) – List of subject IDs to download. If None, all subjects are downloaded.

  • path (None | str) – Location in which to look for the stored data. If None, the environment variable or config parameter MNE_DATASETS_(dataset)_PATH is used. If it doesn’t exist, the “~/mne_data” directory is used. If the dataset is not found under the given path, the data will be automatically downloaded to the specified folder.

  • force_update (bool) – Force update of the dataset even if a local copy exists.

  • update_path (bool | None) – If True, set the MNE_DATASETS_(dataset)_PATH in mne-python config to the given path. If None, the user is prompted.

  • accept (bool) – Accept the license terms required to download the data, if any. Default: False

  • verbose (bool, str, int, or None) – If not None, override default verbose level (see mne.verbose()).

get_additional_metadata(subject: str, session: str, run: str) None | DataFrame[source]#

Load additional metadata for a specific subject, session, and run.

This method is intended to be overridden by subclasses to provide additional metadata specific to the dataset. The metadata is typically loaded from an events.tsv file or similar data source.

Parameters:
  • subject (str) – The identifier for the subject.

  • session (str) – The identifier for the session.

  • run (str) – The identifier for the run.

Returns:

A DataFrame containing the additional metadata if available, otherwise None.

Return type:

None | pd.DataFrame

get_block_repetition(paradigm, subjects, block_list, repetition_list)[source]#

Select data for all provided subjects, blocks and repetitions.

subject -> session -> run -> block -> repetition

See also

BaseDataset.get_data

Parameters:
  • subjects (List of int) – List of subject numbers

  • block_list (List of int) – List of block numbers

  • repetition_list (List of int) – List of repetition numbers inside a block

Returns:

data – dict containing the raw data

Return type:

Dict

get_data(subjects=None, cache_config=None, process_pipeline=None)[source]#

Return the data corresponding to a list of subjects.

The returned data is a dictionary with the following structure:

data = {'subject_id' :
            {'session_id':
                {'run_id': run}
            }
        }

Subjects are at the top level, followed by sessions and then runs. A session is a recording done in a single day, without removing the EEG cap. A session consists of at least one run. A run is a single contiguous recording. Some datasets split a session into multiple runs.

Processing steps can optionally be applied to the data using the *_pipeline arguments. These pipelines are applied in the following order: raw_pipeline -> epochs_pipeline -> array_pipeline. If a *_pipeline argument is None, the step will be skipped. Therefore, the array_pipeline may either receive a mne.io.Raw or a mne.Epochs object as input depending on whether epochs_pipeline is None or not.

Parameters:
  • subjects (List of int) – List of subject numbers

  • cache_config (dict | CacheConfig) – Configuration for caching of datasets. See CacheConfig for details.

  • process_pipeline (Pipeline | None) – Optional processing pipeline to apply to the data. To generate an adequate pipeline, we recommend using moabb.utils.make_process_pipelines(). This pipeline will receive mne.io.BaseRaw objects. The step names of this pipeline should be elements of StepType. According to their name, the steps should return a mne.io.BaseRaw, a mne.Epochs, or a numpy.ndarray. This pipeline must be “fixed” because it will not be trained, i.e. no call to fit will be made.

Returns:

data – dict containing the raw data

Return type:

Dict
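The nested subject, session, run layout described above can be traversed with three loops. The sketch below uses a stand-in dictionary with strings in place of the recordings so that it runs without downloading anything; iter_runs is a hypothetical helper, and the key types shown (integer subjects, string session and run names) are assumptions for illustration.

```python
def iter_runs(data):
    """Yield (subject, session, run_name, run) from a nested get_data()-style dict."""
    for subject, sessions in data.items():
        for session, runs in sessions.items():
            for run_name, run in runs.items():
                yield subject, session, run_name, run


# Stand-in for real output: strings replace the recording objects.
fake_data = {
    1: {
        "0": {"0": "raw_s1_ses0"},
        "1": {"0": "raw_s1_ses1"},
    }
}

flat = list(iter_runs(fake_data))
for subject, session, run_name, run in flat:
    print(subject, session, run_name, run)
```

With real output from get_data(), the same generator could be passed the returned dictionary directly.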

get_trial_info(subjects=None)[source]#

Return trial-length metadata for the requested subjects.

Loads only the TrialData metadata from the .mat files (without building full MNE Raw objects) and summarises trial durations for artifact-free trials.

Parameters:

subjects (list of int or None) – Subjects to query. Defaults to all selected subjects (self.subject_list).

Returns:

info – Nested dict {subject_id: {session_id: {...}}} where each innermost dict contains:

  • triallengths : np.ndarray – lengths of artifact-free trials

  • n_total : int – total number of trials

  • n_artifact_free : int – trials without artifacts

  • min : float – shortest artifact-free trial

  • max : float – longest artifact-free trial

  • median : float – median artifact-free trial length

Return type:

dict
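To make the shape of the innermost dictionary concrete, the sketch below builds one from a list of trial lengths and artifact flags. It is an illustration of the documented fields only, not the method's actual implementation: summarize_trials and its inputs are hypothetical, and it assumes at least one artifact-free trial.

```python
import numpy as np


def summarize_trials(trial_lengths, has_artifact):
    """Build a dict with the fields documented for get_trial_info()."""
    lengths = np.asarray(trial_lengths, dtype=float)
    clean = lengths[~np.asarray(has_artifact, dtype=bool)]
    return {
        "triallengths": clean,              # lengths of artifact-free trials
        "n_total": int(lengths.size),       # all trials, with or without artifacts
        "n_artifact_free": int(clean.size),
        "min": float(clean.min()),
        "max": float(clean.max()),
        "median": float(np.median(clean)),
    }


summary = summarize_trials(
    trial_lengths=[0.5, 2.0, 6.0, 3.0],
    has_artifact=[False, True, False, False],
)
print(summary)
```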

property metadata: DatasetMetadata | None[source]#

Return structured metadata for this dataset.

Returns the DatasetMetadata object from the centralized catalog, or None if metadata is not available for this dataset.

Returns:

The metadata object containing acquisition parameters, participant demographics, experiment details, and documentation. Returns None if no metadata is registered for this dataset.

Return type:

DatasetMetadata | None

Examples

>>> from moabb.datasets import BNCI2014_001
>>> dataset = BNCI2014_001()
>>> dataset.metadata.participants.n_subjects
9
>>> dataset.metadata.acquisition.sampling_rate
250.0

suggest_interval(subjects=None, keep_ratio=1.0)[source]#

Suggest an epoch interval that retains a given fraction of trials.

Parameters:
  • subjects (list of int or None) – Subjects to consider. Defaults to all selected subjects.

  • keep_ratio (float) – Fraction of artifact-free trials to retain (between 0 and 1). For example, keep_ratio=0.95 returns an interval whose tmax equals the 5th percentile of trial lengths, so that at least 95 % of artifact-free trials are long enough.

Returns:

interval – [tmin, tmax] where tmin is the current self.interval[0] and tmax is chosen to satisfy keep_ratio.

Return type:

list of float

Examples

>>> from moabb.datasets import Stieger2021
>>> ds = Stieger2021(subjects=[1, 2, 3])
>>> ds.suggest_interval(keep_ratio=0.95)
[0, 2.1]
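The relationship between keep_ratio, tmax and the rejection of short trials can be illustrated with a small NumPy sketch. This is an assumed reading of the documented behaviour rather than MOABB's implementation: suggest_tmax and kept_fraction are hypothetical helpers, and the percentile is taken as an order statistic so that the retained fraction never falls below keep_ratio.

```python
import numpy as np


def suggest_tmax(trial_lengths, keep_ratio=1.0):
    """Pick tmax so that at least keep_ratio of the trials are long enough."""
    lengths = np.sort(np.asarray(trial_lengths, dtype=float))
    # Number of shortest trials that may be dropped.
    n_drop = int(np.floor((1.0 - keep_ratio) * lengths.size))
    n_drop = min(n_drop, lengths.size - 1)  # always keep at least one trial
    return float(lengths[n_drop])


def kept_fraction(trial_lengths, tmax):
    """Fraction of trials that are not shorter than tmax (i.e. not rejected)."""
    lengths = np.asarray(trial_lengths, dtype=float)
    return float(np.mean(lengths >= tmax))


# Toy trial lengths, evenly spread between 0.5 and 6.0.
toy_lengths = np.linspace(0.5, 6.0, 20)
for ratio in (1.0, 0.9, 0.5):
    tmax = suggest_tmax(toy_lengths, keep_ratio=ratio)
    print(ratio, tmax, kept_fraction(toy_lengths, tmax))
```

With keep_ratio=1.0 the suggested tmax is the shortest trial, so nothing is rejected; lowering keep_ratio trades retained trials for a longer epoch.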