cedalion.io.bids

Utilities for converting fNIRS datasets to the BIDS standard.

Provides functions to organise raw SNIRF files into a BIDS directory tree, generate BIDS-compliant filenames, create required sidecar files (dataset_description.json, participants.tsv/.json, _scans.tsv, _sessions.tsv), and read/write optode positions in the BIDS _optodes.tsv / _coordsystem.json format.

References

Gorgolewski et al. [GAC+16], Luke et al. [LOC+25]

Functions

check_coord_files(bids_dir)

Checks for and updates *_coordsystem.json files in a BIDS directory.

check_for_bids_field(path_parts, field)

@author: lauracarlton.

copy_rename_snirf(row, dataset_path, bids_dir)

Copies and renames a .snirf into the appropriate destination directory.

create_bids_standard_filenames(row)

Generates a BIDS compliant file name and its parent directory path.

create_data_description(dataset_path, bids_dir)

Creates or updates dataset_description.json in the BIDS directory.

create_participants_files(bids_dir[, ...])

Creates or updates the BIDS participants.tsv and participants.json files.

create_participants_json(bids_dir[, fields])

Creates or updates a participants.json file in a BIDS-compliant directory.

create_participants_tsv(bids_dir, mapping_df)

Creates a participants.tsv file in a BIDS-compliant directory.

create_scan_files(group_df, bids_dir)

Creates and saves a _scans.tsv file per subject/session in the BIDS directory.

create_session_files(group_df, bids_dir)

Creates and saves a _sessions.tsv file per subject in the BIDS directory.

edit_events(row, bids_dir)

Edits a BIDS _events.tsv file in place based on values in row.

export_to_bids_optodes_tsv(tsv_filename, points)

Exports labeled points to a BIDS-conformant _optodes.tsv file.

find_files_with_pattern(start_dir, pattern)

Recursively finds all files matching the given pattern.

get_snirf2bids_mapping_csv(dataset_path)

@author: lauracarlton.

load_from_bids_optodes_tsv(tsv_filename)

Load optodes and landmarks from a BIDS *_optodes.tsv and its *_coordsystem.json.

read_events_from_tsv(fname)

save_source(dataset_path, destination_path)

Copies the dataset into a sourcedata folder in destination_path.

search_for_acq_time_in_scan_files(dataset_path)

Searches for _scans.tsv files in dataset_path and extracts acquisition times.

search_for_acq_time_in_snirf_files(row, ...)

Extracts acquisition time from SNIRF files if missing in the _scans.tsv file.

search_for_sessions_acq_time(dataset_path)

Searches _sessions.tsv files in the dataset path and returns session times.

sort_events(row, bids_dir)

Sorts the events in a BIDS _events.tsv file by onset time.

cedalion.io.bids.read_events_from_tsv(fname: str | Path)

cedalion.io.bids.check_for_bids_field(path_parts: list, field: str)

@author: lauracarlton.

cedalion.io.bids.get_snirf2bids_mapping_csv(dataset_path)

@author: lauracarlton.

cedalion.io.bids.find_files_with_pattern(
start_dir: str | Path,
pattern: str,
) → List[str]

Recursively finds all files matching the given pattern.

Searches in the specified directory and subdirectories.

Parameters:
  • start_dir – The directory to start the search from.

  • pattern – The pattern to match filenames against.

Returns:

A list of file paths (as strings) of all files that match the pattern.
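
The recursive search described above can be approximated with the standard library. This is a minimal sketch, not the cedalion implementation: the name find_files_sketch is hypothetical, and it assumes that pattern is a shell-style glob (for example "*.snirf") matched against file names only.

```python
import fnmatch
import os
from pathlib import Path
from typing import List


def find_files_sketch(start_dir, pattern: str) -> List[str]:
    """Collect files below start_dir whose name matches a glob pattern."""
    matches = []
    # os.walk visits start_dir and every subdirectory beneath it.
    for root, _dirs, files in os.walk(Path(start_dir)):
        # Assumption: the pattern is applied to the bare file name.
        for name in fnmatch.filter(files, pattern):
            matches.append(os.path.join(root, name))
    # Sorting keeps the result stable across file systems.
    return sorted(matches)
```

Calling find_files_sketch("my_dataset", "*_scans.tsv") would then list every scans file below the hypothetical my_dataset folder as strings.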

cedalion.io.bids.create_bids_standard_filenames(
row: Series,
) → Tuple[str, str]

Generates a BIDS compliant file name and its parent directory path.

Constructs a filename and directory path following the BIDS naming convention for a specific subject, session, task, acquisition, and run. The filename includes _nirs.snirf as the extension and the directory is nested under a nirs subdirectory.

Parameters:

row

A row of a Pandas DataFrame with the following columns:

  • "sub": The subject identifier (e.g., "01").

  • "ses": The session identifier (e.g., "01"), can be NaN.

  • "task": The task name or identifier (e.g., "rest").

  • "acq": The acquisition identifier, can be NaN.

  • "run": The run identifier (e.g., "1"), can be NaN.

Returns:

A tuple of (bids_filename, parent_directory_path).
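
To illustrate the naming convention, the following sketch assembles a name from the same five keys using a plain dict in place of a pandas row. The helper bids_name_sketch is hypothetical. The entity order (sub, ses, task, acq, run) follows the BIDS specification; whether the library places the session in the directory path exactly this way is an assumption.

```python
import math


def _is_missing(value) -> bool:
    """Treat None and float NaN as 'entity not provided'."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def bids_name_sketch(row: dict):
    """Build (filename, parent_directory) from BIDS entities in row."""
    parts = [f"sub-{row['sub']}"]
    dirs = [f"sub-{row['sub']}"]
    if not _is_missing(row.get("ses")):
        parts.append(f"ses-{row['ses']}")
        dirs.append(f"ses-{row['ses']}")  # assumption: session adds a folder level
    parts.append(f"task-{row['task']}")
    # Optional entities are skipped when missing.
    for key in ("acq", "run"):
        if not _is_missing(row.get(key)):
            parts.append(f"{key}-{row[key]}")
    filename = "_".join(parts) + "_nirs.snirf"
    parent = "/".join(dirs + ["nirs"])
    return filename, parent
```

With sub "01", task "rest", run "1" and the remaining entities set to NaN, the sketch yields the pair ("sub-01_task-rest_run-1_nirs.snirf", "sub-01/nirs").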

cedalion.io.bids.copy_rename_snirf(row: Series, dataset_path: str, bids_dir: str)

Copies and renames a .snirf into the appropriate destination directory.

This function takes the source file (in the dataset_path), renames it based on the information in the provided row, and copies it to the target bids_dir directory, following the BIDS directory structure.

Parameters:
  • row (pd.Series) –

    A row from a Pandas DataFrame containing the following columns:

    • "current_name": The current name of the file (without the .snirf extension).

    • "parent_path": The relative path within the BIDS structure where the file should be stored.

    • "bids_name": The new BIDS-compliant name for the file.

  • dataset_path (str) – The path to the directory containing the original .snirf file(s) to be copied.

  • bids_dir (str) – The path to the root BIDS directory where the renamed file should be copied to.
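
The copy-and-rename step can be pictured as follows. This is a rough sketch rather than the library code: copy_rename_sketch is a hypothetical name, the row is a plain dict, and it is assumed that "bids_name" already carries the _nirs.snirf ending and that the source file sits directly in dataset_path.

```python
import shutil
from pathlib import Path


def copy_rename_sketch(row: dict, dataset_path: str, bids_dir: str) -> Path:
    """Copy <dataset_path>/<current_name>.snirf to its BIDS location."""
    # Assumption: current_name has no extension, as stated in the docs above.
    source = Path(dataset_path) / f"{row['current_name']}.snirf"
    target_dir = Path(bids_dir) / row["parent_path"]
    # Create sub-XX[/ses-YY]/nirs if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / row["bids_name"]
    # copy2 leaves the original in place and preserves file metadata.
    shutil.copy2(source, target)
    return target
```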

cedalion.io.bids.search_for_acq_time_in_scan_files(dataset_path: str) → DataFrame

Searches for _scans.tsv files in dataset_path and extracts acquisition times.

Looks for all _scans.tsv files in dataset_path, reads them into a DataFrame, and processes the filename and acq_time columns. If acq_time does not exist in the merged DataFrame, it is added with None values. If no _scans.tsv files are found, an empty DataFrame with columns filename_org and acq_time is returned.

Parameters:

dataset_path (str) – The path to the dataset where the _scans.tsv files are located.

Returns:

A DataFrame with the following columns:

  • "filename_org": The original filename (without the .snirf extension) from the _scans.tsv files.

  • "acq_time": The acquisition time for each scan, or None if the column does not exist in the original files.

Return type:

pd.DataFrame

cedalion.io.bids.search_for_acq_time_in_snirf_files(
row: Series,
dataset_path: str,
) → datetime

Extracts acquisition time from SNIRF files if missing in the _scans.tsv file.

Checks if acq_time is NaN in the input row. If missing, loads the corresponding SNIRF file, extracts the measurement date and time, and returns it as an ISO 8601 timestamp string.

Parameters:
  • row – A row from mapping_df containing current_name and acq_time columns.

  • dataset_path – Path to the dataset where the SNIRF files are located.

Returns:

The acquisition timestamp extracted from the SNIRF file, or the existing acq_time value if it is not missing.
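
The fallback logic can be sketched without opening a SNIRF file. The helper resolve_acq_time_sketch below is hypothetical; it assumes the measurement date and time have already been read from the file as "YYYY-MM-DD" and "hh:mm:ss" strings, and it returns the ISO 8601 string form described in the prose above.

```python
import math
from datetime import datetime


def _is_missing(value) -> bool:
    """Treat None and float NaN as a missing acquisition time."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def resolve_acq_time_sketch(row: dict, measurement_date: str, measurement_time: str):
    """Keep an existing acq_time, otherwise derive one from the SNIRF metadata."""
    if not _is_missing(row.get("acq_time")):
        return row["acq_time"]
    # Assumption: date and time strings carry no time-zone suffix.
    stamp = datetime.fromisoformat(f"{measurement_date}T{measurement_time}")
    return stamp.isoformat()
```

For a row whose acq_time is NaN, passing "2024-03-01" and "13:20:00" gives "2024-03-01T13:20:00".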

cedalion.io.bids.search_for_sessions_acq_time(dataset_path: str) → DataFrame

Searches _sessions.tsv files in the dataset path and returns session times.

Looks for all _sessions.tsv files in dataset_path, reads them into DataFrames, and extracts the subject ID and session acquisition time. If acq_time does not exist in the input files, it is added with None values. Subject IDs are extracted from filenames using a regular expression.

Parameters:

dataset_path – The path to the dataset where _sessions.tsv files are located.

Returns:

A DataFrame with the following columns:

  • "ses": The session identifier (extracted from the filenames).

  • "sub": The subject ID extracted from the filename.

  • "ses_acq_time": The session acquisition time, or None if acq_time does not exist in the original files.

Return type:

pd.DataFrame

cedalion.io.bids.create_scan_files(group_df: DataFrame, bids_dir: str) → None

Creates and saves a _scans.tsv file per subject/session in the BIDS directory.

Generates a _scans.tsv for each group (by subject and session) in group_df. The file contains two columns: filename (relative path to the NIRS file) and acq_time (acquisition time).

Parameters:
  • group_df – A grouped DataFrame for a particular subject and session. Must include at least the bids_name and acq_time columns.

  • bids_dir – The path to the BIDS directory where _scans.tsv will be saved.
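
The two-column layout of a _scans.tsv file can be reproduced with the csv module. This sketch is illustrative only: write_scans_tsv_sketch is a hypothetical name, rows are plain dicts, and the "nirs/" prefix assumes the scans file sits one level above the nirs folder.

```python
import csv


def write_scans_tsv_sketch(rows, scans_path) -> None:
    """Write filename and acq_time columns as tab-separated values."""
    with open(scans_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["filename", "acq_time"])
        for row in rows:
            # Assumption: paths are relative to the folder holding the scans file.
            writer.writerow([f"nirs/{row['bids_name']}", row["acq_time"]])
```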

cedalion.io.bids.create_session_files(group_df: DataFrame, bids_dir: str) → None

Creates and saves a _sessions.tsv file per subject in the BIDS directory.

Generates a _sessions.tsv for each subject in group_df. The file contains two columns: ses (session identifier) and acq_time (session acquisition time).

Parameters:
  • group_df – A grouped DataFrame for a particular subject. Must include at least the ses and ses_acq_time columns.

  • bids_dir – The path to the BIDS directory where _sessions.tsv will be saved.

cedalion.io.bids.create_data_description(
dataset_path: str,
bids_dir: str,
extra_meta_data: str | None = None,
) → None

Creates or updates dataset_description.json in the BIDS directory.

Checks for an existing dataset_description.json in dataset_path and updates it with relevant metadata. Additional metadata from extra_meta_data is merged in if provided. Missing required keys are filled with default values.

Parameters:
  • dataset_path – The path to the dataset where dataset_description.json is located.

  • bids_dir – The path to the BIDS directory where the updated dataset_description.json will be saved.

  • extra_meta_data – Path to a JSON file with additional metadata to include. Defaults to None.
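
The merge-then-fill behaviour might look like the sketch below. Everything here is an assumption for illustration: the helper name, the placeholder default values, and the choice to let extra metadata override existing entries. Only the two keys shown, Name and BIDSVersion, are known to be required by the BIDS specification.

```python
# Hypothetical placeholder defaults; the library's actual values may differ.
HYPOTHETICAL_DEFAULTS = {"Name": "n/a", "BIDSVersion": "1.10.0"}


def merge_description_sketch(existing: dict, extra=None) -> dict:
    """Combine an existing description with extra metadata and defaults."""
    merged = dict(existing)
    if extra:
        # Assumption: extra metadata takes precedence over existing entries.
        merged.update(extra)
    # Fill in only the keys that are still missing.
    for key, value in HYPOTHETICAL_DEFAULTS.items():
        merged.setdefault(key, value)
    return merged
```

The result could then be written to dataset_description.json in bids_dir with json.dump.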

cedalion.io.bids.check_coord_files(bids_dir: str) → None

Checks for and updates *_coordsystem.json files in a BIDS directory.

Searches for files matching *_coordsystem.json in bids_dir. If NIRSCoordinateSystem is empty, it is set to "Other". Coordinate units are normalized to SI abbreviations (e.g. "millimeter" → "mm"); unrecognized units are set to "n/a".

Parameters:

bids_dir – The path to the BIDS directory where *_coordsystem.json files are located.
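
The normalization rules can be sketched on an already loaded JSON dict. The function name and the alias table are hypothetical, and it is assumed that the units live under the BIDS key NIRSCoordinateUnits; the set of spellings the library actually recognizes is not documented here.

```python
# Hypothetical alias table mapping unit spellings to SI abbreviations.
UNIT_ALIASES = {
    "millimeter": "mm", "millimeters": "mm", "mm": "mm",
    "centimeter": "cm", "centimeters": "cm", "cm": "cm",
    "meter": "m", "meters": "m", "m": "m",
}


def normalize_coordsystem_sketch(meta: dict) -> dict:
    """Return a copy of meta with coordinate system and units normalized."""
    fixed = dict(meta)
    # An empty or missing coordinate system falls back to "Other".
    if not fixed.get("NIRSCoordinateSystem"):
        fixed["NIRSCoordinateSystem"] = "Other"
    units = str(fixed.get("NIRSCoordinateUnits", "")).strip().lower()
    # Unknown spellings become "n/a", mirroring the description above.
    fixed["NIRSCoordinateUnits"] = UNIT_ALIASES.get(units, "n/a")
    return fixed
```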

cedalion.io.bids.create_participants_tsv(
bids_dir: str,
mapping_df: DataFrame,
fields: List[str] | None = None,
) → None

Creates a participants.tsv file in a BIDS-compliant directory.

This function generates a participants.tsv file based on the provided mapping_df, which must include at least a "sub" column (subject identifier). It ensures that the specified fields are present in the output, initializing any missing fields with None.

Parameters:
  • bids_dir (str) – Path to the BIDS directory.

  • mapping_df (pd.DataFrame) – A DataFrame containing subject metadata, including a "sub" column.

  • fields (List[str], optional) – A list of additional participant-level fields to include in the TSV. Defaults to ["species", "age", "sex", "handedness"].

Returns:

Writes participants.tsv to the specified BIDS directory.

Return type:

None

cedalion.io.bids.create_participants_json(bids_dir: str, fields: List[str] | None = None) → None

Creates or updates a participants.json file in a BIDS-compliant directory.

If no custom fields are provided, this function uses a default schema based on BIDS recommendations. The output describes participant-level metadata for each field in the corresponding participants.tsv file.

Parameters:
  • bids_dir (str) – Path to the BIDS directory.

  • fields (List[str], optional) – List of fields to include in the JSON schema. If None, a default set is used.

Returns:

Writes participants.json to the specified BIDS directory.

Return type:

None

cedalion.io.bids.create_participants_files(
bids_dir: str,
mapping_df: DataFrame | None = None,
participants_tsv_path: str | None = None,
participants_json_path: str | None = None,
fields: List[str] | None = None,
)

Creates or updates the BIDS participants.tsv and participants.json files.

If a participants.tsv file already exists and contains data, it is cleaned and standardized:

  • Ensures the first column is named participant_id.

  • Prepends "sub-" to subject IDs if missing.

  • Sorts participants by ID.

The corresponding participants.json is also updated or created based on the TSV’s columns. If no valid participants.tsv is found, falls back to generating new files from mapping_df.

Parameters:
  • bids_dir – Path to the BIDS directory where output files will be written.

  • mapping_df – Used to create participants.tsv if no existing file is found.

  • participants_tsv_path – Path to an existing participants.tsv file.

  • participants_json_path – Path to an existing participants.json file.

  • fields – Fields to include in the schema. If None, a default set is used.
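
The three clean-up steps applied to an existing participants.tsv (rename the first column, prepend "sub-", sort by ID) can be sketched on plain lists. The helper clean_participants_sketch is hypothetical and operates on a header list and row lists rather than on a DataFrame.

```python
def clean_participants_sketch(header, rows):
    """Standardize a participants table given as a header list and row lists."""
    # Step 1: the first column is always called participant_id.
    new_header = ["participant_id"] + list(header[1:])
    cleaned = []
    for row in rows:
        pid = str(row[0])
        # Step 2: add the "sub-" prefix only when it is missing.
        if not pid.startswith("sub-"):
            pid = f"sub-{pid}"
        cleaned.append([pid] + list(row[1:]))
    # Step 3: sort participants by their ID.
    cleaned.sort(key=lambda r: r[0])
    return new_header, cleaned
```

Given the header ["subject", "age"] and the rows [["02", "31"], ["sub-01", "25"]], the sketch returns the header ["participant_id", "age"] with rows ordered sub-01, sub-02.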

cedalion.io.bids.edit_events(row: Series, bids_dir: str) → None

Edits a BIDS _events.tsv file in place based on values in row.

Updates the "duration" and/or "trial_type" columns of the corresponding _events.tsv file.

Parameters:
  • row

    A row from the mapping DataFrame with the following keys:

    • "cond": Serialised list of keys for mapping trial types, or None.

    • "cond_match": Serialised list of replacement values, or None.

    • "duration": Duration to write into every event, or None.

    • "bids_name": Base name of the BIDS file used to locate the _events.tsv.

    • "parent_path": Relative path to the directory containing the _events.tsv.

  • bids_dir – The root directory of the BIDS dataset.

cedalion.io.bids.sort_events(row: Series, bids_dir: str) → None

Sorts the events in a BIDS _events.tsv file by onset time.

Locates the corresponding _events.tsv file for the given row, reads it, sorts events by the "onset" column, and overwrites the original file.

Parameters:
  • row – A row from a BIDS file metadata DataFrame. Must include "bids_name" and "parent_path" keys.

  • bids_dir – The root directory of the BIDS dataset.
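
The read, sort, overwrite cycle can be sketched with the csv module. This is not the library code: sort_events_sketch is a hypothetical name, it takes the path to the events file directly instead of deriving it from row and bids_dir, and it assumes every "onset" value parses as a number.

```python
import csv


def sort_events_sketch(events_path) -> None:
    """Sort an _events.tsv file by onset and overwrite it in place."""
    with open(events_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        fieldnames = reader.fieldnames
        rows = list(reader)
    # Compare onsets numerically so that 10.0 sorts after 2.5.
    rows.sort(key=lambda r: float(r["onset"]))
    with open(events_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
```

Sorting on the numeric value rather than on the raw string matters here, because a plain string sort would place "10.0" before "2.5".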

cedalion.io.bids.save_source(dataset_path: str, destination_path: str) → None

Copies the dataset into a sourcedata folder in destination_path.

If a sourcedata subfolder already exists inside dataset_path, only that subfolder is copied. Otherwise the entire dataset is copied.

Parameters:
  • dataset_path – Path to the original dataset.

  • destination_path – Directory where the sourcedata folder will be created.
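
The choice between copying the whole dataset and copying only an existing sourcedata subfolder can be sketched as follows. The helper save_source_sketch is hypothetical, and the sketch assumes destination_path lies outside dataset_path and that Python 3.8 or later is available for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def save_source_sketch(dataset_path, destination_path) -> Path:
    """Copy the dataset, or its sourcedata subfolder, to <destination>/sourcedata."""
    dataset = Path(dataset_path)
    nested = dataset / "sourcedata"
    # Prefer an existing sourcedata subfolder over the full dataset.
    source = nested if nested.is_dir() else dataset
    target = Path(destination_path) / "sourcedata"
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```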

cedalion.io.bids.export_to_bids_optodes_tsv(
tsv_filename,
points: Annotated[DataArray, DataArraySchema(dims='label', coords='label', 'label', 'type')],
units='mm',
float_format: str | None = None,
)

Exports labeled points to a BIDS-conformant _optodes.tsv file.

Parameters:
  • tsv_filename – Path to the output tsv file.

  • points – LabeledPoints to save.

  • units – Coordinate units. Defaults to 'mm'.

  • float_format – Format string for floating point numbers.

cedalion.io.bids.load_from_bids_optodes_tsv(
tsv_filename: Path | str,
) → Annotated[DataArray, DataArraySchema(dims='label', coords='label', 'label', 'type')]

Load optodes and landmarks from a BIDS *_optodes.tsv and its *_coordsystem.json.

The coordinate system name, units, and anatomical landmarks are read from the accompanying *_coordsystem.json file. The JSON is expected at the same path as the TSV, with _optodes.tsv replaced by _coordsystem.json.

Parameters:

tsv_filename – Path to the BIDS *_optodes.tsv file.

Returns:

LabeledPoints with sources, detectors, and (if present) landmarks.
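
The rule for locating the sidecar JSON can be sketched as a small path helper. The name coordsystem_path_sketch is hypothetical, and raising a ValueError for unexpected file names is an assumption, not documented library behaviour.

```python
from pathlib import Path


def coordsystem_path_sketch(tsv_filename) -> Path:
    """Derive the *_coordsystem.json path that accompanies an *_optodes.tsv."""
    tsv_path = Path(tsv_filename)
    suffix = "_optodes.tsv"
    if not tsv_path.name.endswith(suffix):
        # Assumption: other file names are treated as an error.
        raise ValueError(f"expected a file ending in {suffix}: {tsv_path.name}")
    # Swap only the trailing suffix and keep the directory unchanged.
    return tsv_path.with_name(tsv_path.name[: -len(suffix)] + "_coordsystem.json")
```

For "bids/sub-01/nirs/sub-01_optodes.tsv" the sketch returns the path bids/sub-01/nirs/sub-01_coordsystem.json.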