cedalion.io.bids

Functions

check_coord_files(bids_dir)

Checks for and updates *_coordsystem.json files in a BIDS directory.

check_for_bids_field(path_parts, field)

@author: lauracarlton.

copy_rename_snirf(row, dataset_path, bids_dir)

Copies a .snirf file from the source directory, renames it according to BIDS standards, and places it in the appropriate destination directory.

create_bids_standard_filenames(row)

Generates a BIDS-compliant file name and its parent directory path based on the information in the given row.

create_data_description(dataset_path, bids_dir)

Creates or updates the dataset_description.json file in the specified BIDS directory.

create_participants_files(bids_dir[, ...])

Creates or updates participants.tsv and participants.json files in a BIDS-compliant directory.

create_participants_json(bids_dir[, fields])

Creates or updates a participants.json file in a BIDS-compliant directory.

create_participants_tsv(bids_dir, mapping_df)

Creates a participants.tsv file in a BIDS-compliant directory.

create_scan_files(group_df, bids_dir)

Creates a _scans.tsv file for each subject (and session, if provided) from the provided DataFrame and saves it in the BIDS directory.

create_session_files(group_df, bids_dir)

Creates a _sessions.tsv file for each subject from the provided DataFrame and saves it in the BIDS directory.

edit_events(row, bids_dir)

Edits an events.tsv file in a BIDS directory based on specified conditions.

find_files_with_pattern(start_dir, pattern)

Recursively finds all files in the specified directory (and subdirectories) that match the given pattern.

get_snirf2bids_mapping_csv(dataset_path)

@author: lauracarlton.

read_events_from_tsv(fname)

save_source(dataset_path, destination_path)

Copies the dataset to a 'sourcedata' folder within the specified destination path.

search_for_acq_time_in_scan_files(dataset_path)

Searches for _scans.tsv files in the given dataset path, reads them into DataFrames, and processes them to extract the filename and acq_time columns.

search_for_acq_time_in_snirf_files(row, ...)

Extracts acquisition time from SNIRF files if missing in the _scans.tsv file.

search_for_sessions_acq_time(dataset_path)

Searches for _sessions.tsv files in the provided dataset path, reads them into DataFrames, and processes them to extract the session_id, sub (subject ID), and ses_acq_time (session acquisition time).

sort_events(row, bids_dir)

Sorts the events in a BIDS-compatible .tsv file by onset time.

cedalion.io.bids.read_events_from_tsv(fname: str | Path)[source]

cedalion.io.bids.check_for_bids_field(path_parts: list, field: str)[source]

@author: lauracarlton.

cedalion.io.bids.get_snirf2bids_mapping_csv(dataset_path)[source]

@author: lauracarlton.

cedalion.io.bids.find_files_with_pattern(
start_dir: str | Path,
pattern: str,
) → List[str][source]

Recursively finds all files in the specified directory (and subdirectories) that match the given pattern.

Parameters:

start_dir : str | Path

The directory to start the search from.

pattern : str

The pattern to match filenames against.

Returns:

List[str]

A list of file paths (as strings) of all files that match the pattern.
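
A minimal usage sketch; the directory and pattern below are placeholders:

    from cedalion.io import bids

    # Collect all SNIRF files anywhere below the source dataset.
    snirf_files = bids.find_files_with_pattern("/data/my_study", "*.snirf")
    for path in snirf_files:
        print(path)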

cedalion.io.bids.create_bids_standard_filenames(
row: Series,
) → Tuple[str, str][source]

Generates a BIDS-compliant file name and its parent directory path based on the information in the given row.

This function constructs a filename and directory path following the BIDS naming convention for a specific subject, session, task, acquisition, and run based on the provided DataFrame row. The final filename will include “_nirs.snirf” as the extension, and the directory path will be created under a “nirs” directory.

Parameters:

row : pd.Series

A row of a Pandas DataFrame containing the following potential columns:

  • “sub” : The subject identifier (e.g., “01”)

  • “ses” : The session identifier (e.g., “01”), can be NaN if not available

  • “task” : The task name or identifier (e.g., “rest”)

  • “acq” : The acquisition identifier (e.g., “01”), can be NaN if not available

  • “run” : The run identifier (e.g., “1”), can be NaN if not available

Returns:

Tuple[str, str]

A tuple containing: 1. The generated filename string based on the BIDS standard. 2. The parent directory path where the file is expected to be located.
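
A minimal sketch of a single-row call; the identifier values are placeholders and follow the column description above:

    import pandas as pd
    from cedalion.io import bids

    row = pd.Series({"sub": "01", "ses": "01", "task": "rest", "acq": None, "run": "1"})
    bids_name, parent_path = bids.create_bids_standard_filenames(row)
    # bids_name is expected to look something like
    # "sub-01_ses-01_task-rest_run-1_nirs.snirf", with parent_path ending in "nirs".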

cedalion.io.bids.copy_rename_snirf(row: Series, dataset_path: str, bids_dir: str)[source]

Copies a .snirf file from the source directory, renames it according to BIDS standards, and places it in the appropriate destination directory.

This function takes the source file (in the dataset_path), renames it based on the information in the provided row, and copies it to the target bids_dir directory, following the BIDS directory structure.

Parameters:

row : pd.Series

A row from a Pandas DataFrame containing the following columns:

  • “current_name” : The current name of the file (without the .snirf extension).

  • “parent_path” : The relative path within the BIDS structure where the file should be stored.

  • “bids_name” : The new BIDS-compliant name for the file.

dataset_path : str

The path to the directory containing the original .snirf file(s) to be copied.

bids_dir : str

The path to the root BIDS directory where the renamed file should be copied to.
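
A sketch of the typical row-wise usage on a mapping DataFrame; the paths and names below are placeholders, and the columns follow the description above:

    import pandas as pd
    from cedalion.io import bids

    mapping_df = pd.DataFrame([{
        "current_name": "subject1_run1",
        "parent_path": "sub-01/ses-01/nirs",
        "bids_name": "sub-01_ses-01_task-rest_run-1_nirs.snirf",
    }])
    # Extra keyword arguments are forwarded to copy_rename_snirf by DataFrame.apply.
    mapping_df.apply(
        bids.copy_rename_snirf,
        axis=1,
        dataset_path="/data/my_study",
        bids_dir="/data/my_study_bids",
    )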

cedalion.io.bids.search_for_acq_time_in_scan_files(dataset_path: str) → DataFrame[source]

Searches for _scans.tsv files in the given dataset path, reads them into DataFrames, and processes them to extract the filename and acq_time columns.

This function looks for all _scans.tsv files in the dataset_path, reads them into a DataFrame, and processes the filename and acq_time columns. If the acq_time column does not exist in the merged DataFrame, it will be added with None values. If no _scans.tsv files are found, an empty DataFrame with the columns filename_org and acq_time is returned.

Parameters:

dataset_path : str

The path to the dataset where the _scans.tsv files are located.

Returns:

pd.DataFrame

A DataFrame with the following columns:

  • filename_org: The original filename (without the .snirf extension) from the _scans.tsv files.

  • acq_time: The acquisition time for each scan. If the acq_time column does not exist in the original files, it will be filled with None.
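
A minimal sketch; the dataset path is a placeholder:

    from cedalion.io import bids

    scan_times = bids.search_for_acq_time_in_scan_files("/data/my_study")
    # Per the description above, scan_times has the columns filename_org and acq_time.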

cedalion.io.bids.search_for_acq_time_in_snirf_files(
row: Series,
dataset_path: str,
) → datetime[source]

Extracts acquisition time from SNIRF files if missing in the _scans.tsv file.

This function checks if the acquisition time (acq_time) is missing (NaN) in the input row. If missing, it loads the corresponding SNIRF file, extracts the acquisition date and time, and returns it as a datetime object.

Parameters:

row : pd.Series

A row from the mapping_df DataFrame containing current_name and acq_time columns.

dataset_path : str

Path to the dataset where the SNIRF files are located.

Returns:

datetime

The acquisition timestamp extracted from the SNIRF file, or the existing acq_time if not missing.
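
A sketch of filling missing acquisition times row-wise; it assumes a mapping DataFrame with the documented current_name and acq_time columns (values are placeholders):

    import pandas as pd
    from cedalion.io import bids

    mapping_df = pd.DataFrame({"current_name": ["subject1_run1"], "acq_time": [pd.NaT]})
    mapping_df["acq_time"] = mapping_df.apply(
        bids.search_for_acq_time_in_snirf_files, axis=1, dataset_path="/data/my_study"
    )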

cedalion.io.bids.search_for_sessions_acq_time(dataset_path: str) → DataFrame[source]

Searches for _sessions.tsv files in the provided dataset path, reads them into DataFrames, and processes them to extract the session_id, sub (subject ID), and ses_acq_time (session acquisition time).

This function looks for all _sessions.tsv files in the given dataset_path, reads them into DataFrames, and processes them to extract the subject ID and session acquisition time. If the acq_time column does not exist in the input files, it will be added with None values. Additionally, it extracts the subject ID from the filename using a regular expression.

Parameters:

dataset_path : str

The path to the dataset where the _sessions.tsv files are located.

Returns:

pd.DataFrame

A DataFrame with the following columns:

  • session_id: The session identifier (extracted from the filenames).

  • sub: The subject ID extracted from the filename.

  • ses_acq_time: The session acquisition time for each session. If the acq_time column does not exist in the original files, it will be filled with None.
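
A minimal sketch; the dataset path is a placeholder:

    from cedalion.io import bids

    session_times = bids.search_for_sessions_acq_time("/data/my_study")
    # Per the description above, session_times has the columns session_id, sub and ses_acq_time.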

cedalion.io.bids.create_scan_files(group_df: DataFrame, bids_dir: str) → None[source]

Creates a _scans.tsv file for each subject (and session, if provided) from the provided DataFrame and saves it in the BIDS directory.

This function generates a _scans.tsv file for each group of data (grouped by subject and session) in the group_df DataFrame. The resulting file contains two columns: filename (with a relative path to the NIRS file) and acq_time (acquisition time). The function saves this file in the appropriate directory within the BIDS dataset.

Parameters:

group_df : pd.DataFrame

A DataFrame containing a group of rows for a particular subject and session. This DataFrame should include at least the bids_name and acq_time columns.

bids_dir : str

The path to the BIDS directory where the _scans.tsv file will be saved.

Returns:

None

This function does not return anything. It saves the generated _scans.tsv file to the BIDS directory.
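
A sketch of the grouped usage described above; the mapping DataFrame and its values are placeholders, and grouping by sub and ses assumes session identifiers are present:

    import pandas as pd
    from cedalion.io import bids

    mapping_df = pd.DataFrame([{
        "sub": "01",
        "ses": "01",
        "bids_name": "sub-01_ses-01_task-rest_run-1_nirs.snirf",
        "acq_time": "2024-01-01T10:00:00",
    }])
    # One _scans.tsv file is written per subject/session group.
    mapping_df.groupby(["sub", "ses"]).apply(
        bids.create_scan_files, bids_dir="/data/my_study_bids"
    )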

cedalion.io.bids.create_session_files(group_df: DataFrame, bids_dir: str) → None[source]

Creates a _sessions.tsv file for each subject from the provided DataFrame and saves it in the BIDS directory.

This function generates a _sessions.tsv file for each subject in the group_df DataFrame. The resulting file contains two columns: ses (session identifier) and acq_time (session acquisition time). The function saves this file in the appropriate directory within the BIDS dataset.

Parameters:

group_df : pd.DataFrame

A DataFrame containing a group of rows for a particular subject. This DataFrame should include at least the ses (session identifier) and ses_acq_time (session acquisition time) columns.

bids_dir : str

The path to the BIDS directory where the _sessions.tsv file will be saved.

Returns:

None

This function does not return anything. It saves the generated _sessions.tsv file to the BIDS directory.
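
A sketch mirroring the create_scan_files example above, but grouped per subject; all values are placeholders:

    import pandas as pd
    from cedalion.io import bids

    mapping_df = pd.DataFrame([{"sub": "01", "ses": "01", "ses_acq_time": "2024-01-01T10:00:00"}])
    # One _sessions.tsv file is written per subject.
    mapping_df.groupby("sub").apply(
        bids.create_session_files, bids_dir="/data/my_study_bids"
    )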

cedalion.io.bids.create_data_description(
dataset_path: str,
bids_dir: str,
extra_meta_data: str | None = None,
) → None[source]

Creates or updates the dataset_description.json file in the specified BIDS directory.

This function checks for an existing dataset_description.json file in the specified dataset path and updates it with relevant metadata. It also adds any additional metadata from an optional external JSON file (extra_meta_data). If some required keys are missing, it will add them with default values.

Parameters:

dataset_path : str

The path to the dataset where the dataset_description.json file is located.

bids_dir : str

The path to the BIDS directory where the updated dataset_description.json file will be saved.

extra_meta_data : Optional[str], default=None

An optional path to a JSON file containing additional metadata to be included in the dataset_description.json. If not provided, no extra metadata will be added.

Returns:

None

This function does not return any value. It updates the dataset_description.json file in the BIDS directory.
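
A minimal sketch; the paths are placeholders, and the extra metadata file is optional:

    from cedalion.io import bids

    bids.create_data_description(
        dataset_path="/data/my_study",
        bids_dir="/data/my_study_bids",
        extra_meta_data="/data/my_study/extra_description.json",  # optional, placeholder path
    )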

cedalion.io.bids.check_coord_files(bids_dir: str) → None[source]

Checks for and updates *_coordsystem.json files in a BIDS directory.

This function searches for files matching the pattern “*_coordsystem.json” within the specified BIDS directory. If the “NIRSCoordinateSystem” field is empty, it updates the field with the value “Other” and writes the updated data back to the JSON file.

Parameters:

bids_dir : str

The path to the BIDS directory where *_coordsystem.json files are located.

Returns:

None

This function does not return any value. It directly modifies the *_coordsystem.json files.

cedalion.io.bids.create_participants_tsv(
bids_dir: str,
mapping_df: DataFrame,
fields: List[str] | None = None,
) → None[source]

Creates a participants.tsv file in a BIDS-compliant directory.

This function generates a participants.tsv file based on the provided mapping_df, which must include at least a “sub” column (subject identifier). It ensures that the specified fields are present in the output, initializing any missing fields with None.

Parameters:
  • bids_dir (str) – Path to the BIDS directory.

  • mapping_df (pd.DataFrame) – A DataFrame containing subject metadata, including a “sub” column.

  • fields (List[str], optional) – A list of additional participant-level fields to include in the TSV. Defaults to [“species”, “age”, “sex”, “handedness”].

Returns:

Writes participants.tsv to the specified BIDS directory.

Return type:

None
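
A minimal sketch, paired with create_participants_json documented below; the subject IDs and field names are placeholders:

    import pandas as pd
    from cedalion.io import bids

    mapping_df = pd.DataFrame({"sub": ["01", "02"]})
    bids.create_participants_tsv("/data/my_study_bids", mapping_df, fields=["age", "sex"])
    # Describe the same columns in participants.json:
    bids.create_participants_json("/data/my_study_bids", fields=["age", "sex"])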

cedalion.io.bids.create_participants_json(bids_dir: str, fields: List[str] | None = None) → None[source]

Creates or updates a participants.json file in a BIDS-compliant directory.

If no custom fields are provided, this function uses a default schema based on BIDS recommendations. The output describes participant-level metadata for each field in the corresponding participants.tsv file.

Parameters:
  • bids_dir (str) – Path to the BIDS directory.

  • fields (List[str], optional) – List of fields to include in the JSON schema. If None, a default set is used.

Returns:

Writes participants.json to the specified BIDS directory.

Return type:

None

cedalion.io.bids.create_participants_files(
bids_dir: str,
mapping_df: DataFrame | None = None,
participants_tsv_path: str | None = None,
participants_json_path: str | None = None,
fields: List[str] | None = None,
)[source]

Creates or updates participants.tsv and participants.json files in a BIDS-compliant directory.

If a participants.tsv file already exists and contains data, it is cleaned and standardized:

  • Ensures the first column is named participant_id

  • Prepends “sub-” to subject IDs if missing

  • Sorts the participants by ID

The corresponding participants.json file is also updated or created based on the TSV’s columns.

If no valid participants.tsv file is found, the function will fall back to generating new files using the provided mapping_df.

Parameters:
  • bids_dir (str) – Path to the BIDS directory where output files will be written.

  • mapping_df (pd.DataFrame, optional) – Used to create participants.tsv if no existing file is found.

  • participants_tsv_path (str, optional) – Path to an existing participants.tsv file.

  • participants_json_path (str, optional) – Path to an existing participants.json file.

  • fields (List[str], optional) – List of fields to include in the schema. If None, a default set is used.
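
A sketch of the two call patterns described above; all paths, subject IDs and fields are placeholders:

    import pandas as pd
    from cedalion.io import bids

    # 1) No usable participants.tsv exists yet: generate both files from a mapping DataFrame.
    mapping_df = pd.DataFrame({"sub": ["01", "02"]})
    bids.create_participants_files("/data/my_study_bids", mapping_df=mapping_df)

    # 2) Reuse and clean existing participant files from the source dataset.
    bids.create_participants_files(
        "/data/my_study_bids",
        participants_tsv_path="/data/my_study/participants.tsv",
        participants_json_path="/data/my_study/participants.json",
    )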

cedalion.io.bids.edit_events(row: Series, bids_dir: str) → None[source]

Edits an events.tsv file in a BIDS directory based on specified conditions.

This function modifies the events.tsv file corresponding to a specific row of the mapping CSV. It updates the “duration” and/or “trial_type” columns based on the values in the row parameter.

Parameters:
  • row (pd.Series) – A pandas Series containing the following keys:

    - “cond” (str or None): A string representing a list of keys for mapping trial types.

    - “cond_match” (str or None): A string representing a list of values for mapping trial types.

    - “duration” (float or None): The duration to update in the events file.

    - “bids_name” (str): The base name of the BIDS file to find the corresponding events.tsv file.

    - “parent_path” (str): The relative path to the directory containing the events.tsv file.

  • bids_dir (str) – The base directory for the BIDS dataset.

Returns:

The function modifies the events.tsv file in place and does not return a value.

Return type:

None
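
A sketch of a single-row call; all values are placeholders, and the string encoding of the cond/cond_match lists is an assumption based on the description above:

    import pandas as pd
    from cedalion.io import bids

    row = pd.Series({
        "cond": "['1', '2']",              # assumed string form of the trial-type keys
        "cond_match": "['rest', 'task']",  # assumed string form of the matching values
        "duration": 10.0,
        "bids_name": "sub-01_ses-01_task-rest_run-1_nirs.snirf",
        "parent_path": "sub-01/ses-01/nirs",
    })
    bids.edit_events(row, bids_dir="/data/my_study_bids")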

cedalion.io.bids.sort_events(row: Series, bids_dir: str) → None[source]

Sorts the events in a BIDS-compatible .tsv file by onset time.

This function locates the corresponding _events.tsv file for a given dataset, reads the file, sorts the events by the “onset” column, and overwrites the original file with the sorted version.

Parameters:

row : pd.Series

A row from a DataFrame containing BIDS file metadata. Must include the keys “bids_name” and “parent_path”.

bids_dir : str

The root directory of the BIDS dataset.
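
A minimal sketch; names and paths are placeholders:

    import pandas as pd
    from cedalion.io import bids

    row = pd.Series({
        "bids_name": "sub-01_ses-01_task-rest_run-1_nirs.snirf",
        "parent_path": "sub-01/ses-01/nirs",
    })
    bids.sort_events(row, bids_dir="/data/my_study_bids")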

cedalion.io.bids.save_source(dataset_path: str, destination_path: str) → None[source]

Copies the dataset to a ‘sourcedata’ folder within the specified destination path.

If a ‘sourcedata’ folder already exists inside the dataset_path, only that folder is copied. Otherwise, the entire dataset is copied into a new ‘sourcedata’ folder in the destination.

Parameters:
  • dataset_path (str) – Path to the original dataset.

  • destination_path (str) – Directory where the ‘sourcedata’ folder will be created and data copied.
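
A minimal sketch; the paths are placeholders:

    from cedalion.io import bids

    # Copies the original dataset into /data/my_study_bids/sourcedata.
    bids.save_source("/data/my_study", "/data/my_study_bids")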