{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Convert a fNIRS dataset to BIDS\n", "\n", " Do not run all cells at once. Carefully read the comments before each code cell — some steps require you to manually modify certain files (e.g., the mapping CSV) before proceeding.
Make sure all required edits are completed before continuing to the next step.
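
As orientation for the manual edits mentioned above: the mapping CSV you will edit later has one row per SNIRF file. The sketch below is purely hypothetical (the path and labels are made up); the real file is generated for you by `bids.get_snirf2bids_mapping_csv` further down in this notebook.

```python
# Hypothetical sketch of the mapping-CSV layout edited later in this notebook.
# "some_folder/some_recording" is a made-up current_name; real values point at
# the SNIRF files of your dataset (relative path, without extension).
import pandas as pd

columns = ["current_name", "sub", "ses", "task", "run", "acq", "cond", "cond_match", "duration"]
row = {"current_name": "some_folder/some_recording", "sub": "001", "task": "resting"}

mapping = pd.DataFrame([row], columns=columns)  # unset optional entities stay empty (NaN)
csv_text = mapping.to_csv(index=False)
```

Only `sub` and `task` are required; the other entity columns may stay empty.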
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:55.648565Z", "iopub.status.busy": "2025-11-11T09:49:55.648337Z", "iopub.status.idle": "2025-11-11T09:49:55.656469Z", "shell.execute_reply": "2025-11-11T09:49:55.655762Z" } }, "outputs": [], "source": [ "# This cells setups the environment when executed in Google Colab.\n", "try:\n", " import google.colab\n", " !curl -s https://raw.githubusercontent.com/ibs-lab/cedalion/dev/scripts/colab_setup.py -o colab_setup.py\n", " # Select branch with --branch \"branch name\" (default is \"dev\")\n", " %run colab_setup.py --branch \"dev\"\n", "except ImportError:\n", " pass" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:55.658186Z", "iopub.status.busy": "2025-11-11T09:49:55.657978Z", "iopub.status.idle": "2025-11-11T09:49:57.310920Z", "shell.execute_reply": "2025-11-11T09:49:57.310074Z" } }, "outputs": [], "source": [ "import os\n", "import re\n", "import shutil\n", "from pathlib import Path\n", "from tempfile import TemporaryDirectory\n", "\n", "import pandas as pd\n", "import snirf2bids as s2b\n", "from seedir import seedir\n", "from rich import print_json\n", "\n", "from cedalion.data import get_snirf2bids_example_dataset\n", "from cedalion.io import bids\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convert your own dataset or an example\n", "\n", "When the constant `DEMO_MODE` is set to `True`, an example dataset is used. Set it to `False` and modify the notebook variables as described below to convert a different dataset." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:57.313706Z", "iopub.status.busy": "2025-11-11T09:49:57.313305Z", "iopub.status.idle": "2025-11-11T09:49:57.316355Z", "shell.execute_reply": "2025-11-11T09:49:57.315580Z" } }, "outputs": [], "source": [ "DEMO_MODE = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Provide file paths and meta data\n", "\n", "This notebook shows how to convert an fNIRS dataset into BIDS format. To use it, provide the following inputs:\n", "\n", "1. Dataset Path: Folder containing the raw dataset.\n", "2. Destination Path: Folder where the BIDS-compliant dataset will be saved.\n", "3. Mapping CSV File: CSV file that defines the dataset structure and provides necessary details for BIDS conversion.\n", "4. (Optional) extra_meta_data File: Additional metadata to include in the description.json file. You can use [this google form](https://docs.google.com/forms/d/e/1FAIpQLSeZjlgIqCwp054HsHmTBKPziqcOlfTcaWpdXcGFYPDf0Q5vNg/viewform?usp=sf_link) or [this website](https://neurojson.org/Create/dataset_description_fnirs) to create this file.\n", "5. (Optional) participants.tsv / participants.json files. If you already have a participants.tsv/.json file and provide the link below, it will be used directly.\n", "Alternatively, if you have participant-level metadata saved in a CSV or Excel file, with the first column as the participant ID and the remaining columns as metadata (with appropriate headers) and you provide the link to it below, the script will convert it into properly formatted .tsv and .json files for BIDS." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:57.318128Z", "iopub.status.busy": "2025-11-11T09:49:57.317904Z", "iopub.status.idle": "2025-11-11T09:49:58.653193Z", "shell.execute_reply": "2025-11-11T09:49:58.652375Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Downloading file 'snirf2bids_example_dataset.zip' from 'https://doc.ibs.tu-berlin.de/cedalion/datasets/dev/snirf2bids_example_dataset.zip' to '/home/runner/.cache/cedalion/dev'.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Unzipping contents of '/home/runner/.cache/cedalion/dev/snirf2bids_example_dataset.zip' to '/home/runner/.cache/cedalion/dev/snirf2bids_example_dataset.zip.unzip'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "dataset_path : /home/runner/.cache/cedalion/dev/snirf2bids_example_dataset.zip.unzip/snirf2bids_example_dataset\n", "destination_path: /tmp/tmpahpn2f41\n", "\n", "snirf2bids_example_dataset/\n", "├─02262024_1100_473/\n", "│ └─2024-02-26_010/\n", "│ ├─2024-02-26_014_config.json\n", "│ ├─2024-02-26_014.snirf\n", "│ ├─2024-02-26_014_lsl.tri\n", "│ ├─2024-02-26_014_config.hdr\n", "│ ├─2024-02-26_014_calibration.json\n", "│ ├─2024-02-26_014_probeInfo.mat\n", "│ ├─2024-02-26_014_description.json\n", "│ ├─2024-02-26_014.wl2\n", "│ ├─2024-02-26_014.wl1\n", "│ └─digpts.txt\n", "├─snirf2BIDS_mapping_edited.csv\n", "├─02272024_1030_474/\n", "│ └─2024-02-27_010/\n", "│ ├─2024-02-27_010_config.json\n", "│ ├─2024-02-27_010_config.hdr\n", "│ ├─2024-02-27_010_calibration.json\n", "│ ├─2024-02-27_010.wl1\n", "│ ├─2024-02-27_010_probeInfo.mat\n", "│ ├─2024-02-27_010.snirf\n", "│ ├─2024-02-27_010_description.json\n", "│ ├─2024-02-27_010.wl2\n", "│ ├─2024-02-27_010_lsl.tri\n", "│ └─digpts.txt\n", "└─readme.txt\n" ] } ], "source": [ "if DEMO_MODE:\n", " dataset_path, edited_mapping_df_path = get_snirf2bids_example_dataset()\n", "\n", " temporary_directory = 
TemporaryDirectory()\n", " destination_path = Path(temporary_directory.name)\n", "\n", " print(f\"dataset_path : {dataset_path}\\ndestination_path: {destination_path}\\n\")\n", " seedir(dataset_path)\n", "\n", "else:\n", " dataset_path = Path('path-to-your-dataset-folder') # REQUIRED\n", " destination_path = Path('path-to-your-destination-bids-folder') # REQUIRED" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.655312Z", "iopub.status.busy": "2025-11-11T09:49:58.655157Z", "iopub.status.idle": "2025-11-11T09:49:58.667096Z", "shell.execute_reply": "2025-11-11T09:49:58.666288Z" } }, "outputs": [ { "data": { "text/plain": [ "'/home/runner/.cache/cedalion/dev/snirf2bids_example_dataset.zip.unzip/snirf2bids_example_dataset/snirf2BIDS_mapping.csv'" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "extra_meta_data_path = Path('path-to-your-meta-data') # OPTIONAL\n", "extra_meta_data_path = extra_meta_data_path if extra_meta_data_path.exists() else None\n", "\n", "mapping_df_path = bids.get_snirf2bids_mapping_csv(dataset_path)\n", "display(mapping_df_path)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.669012Z", "iopub.status.busy": "2025-11-11T09:49:58.668851Z", "iopub.status.idle": "2025-11-11T09:49:58.671829Z", "shell.execute_reply": "2025-11-11T09:49:58.671038Z" } }, "outputs": [], "source": [ "participants_tsv_file = Path('path-to-your-participants.tsv') # OPTIONAL\n", "participants_json_file = Path('path-to-your-participants.json') # OPTIONAL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Please modify the mapping CSV file that is automatically created under your raw dataset folder.\n", "\n", "By default, a mapping CSV file is generated under the main raw dataset folder using the `get_snirf2bids_mapping_csv` function.\n", "Before running the rest of the code, open this file, make any
necessary edits, and save it. A valid mapping CSV must include all SNIRF files in your dataset, along with the following columns:\n", "\n", "- sub: Participant identifier\n", "- ses (optional): Session identifier\n", "- task: Task name or label\n", "- run (optional): Run number\n", "- acq (optional): Acquisition label\n", "- cond (optional): List of condition labels\n", "- cond_match (optional): List of matching condition values\n", "- duration (optional): Event duration in seconds\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.673427Z", "iopub.status.busy": "2025-11-11T09:49:58.673261Z", "iopub.status.idle": "2025-11-11T09:49:58.683908Z", "shell.execute_reply": "2025-11-11T09:49:58.683099Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
current_namesubsestaskrunacqcondcond_matchduration
002262024_1100_473/2024-02-26_010/2024-02-26_014473NaNballsqueezingNaNNaNNaNNaNNaN
102272024_1030_474/2024-02-27_010/2024-02-27_010474NaNballsqueezingNaNNaNNaNNaNNaN
\n", "
" ], "text/plain": [ " current_name sub ses task \\\n", "0 02262024_1100_473/2024-02-26_010/2024-02-26_014 473 NaN ballsqueezing \n", "1 02272024_1030_474/2024-02-27_010/2024-02-27_010 474 NaN ballsqueezing \n", "\n", " run acq cond cond_match duration \n", "0 NaN NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN NaN " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "if DEMO_MODE:\n", " # simulate user edits by replacing mapping_df_path with a prefilled one\n", " shutil.copy(edited_mapping_df_path, mapping_df_path)\n", "\n", "mapping_df = pd.read_csv(mapping_df_path, dtype=str)\n", "mapping_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mapping table created below serves as a key component for organizing and processing your dataset. The `ses`, `run`, and `acq` columns are optional and can be set to None if not applicable. The `current_name` column contains the path to the SNIRF files in your dataset." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Looking for possible *_scan.tsv files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To ensure no important information (e.g., acquisition time) from the original dataset is lost, we will:\n", "\n", "- Search Subdirectories: Traverse through all subdirectories within the dataset.\n", "- Locate Existing Scan Files: Search for all *_scan.tsv files in the dataset.\n", "- Integrate into Mapping Table: Extract the relevant information from these files and add it to our mapping table.\n", "- Extracts acquisition time from SNIRF files if missing in the `_scans.tsv` file.\n", "\n", "This approach ensures that any details, such as acquisition time, are retained and incorporated into the BIDS-compliant structure." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.685613Z", "iopub.status.busy": "2025-11-11T09:49:58.685453Z", "iopub.status.idle": "2025-11-11T09:49:58.893069Z", "shell.execute_reply": "2025-11-11T09:49:58.892237Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
current_namesubsestaskrunacqcondcond_matchdurationfilename_orgacq_time
002262024_1100_473/2024-02-26_010/2024-02-26_014473NaNballsqueezingNaNNaNNaNNaNNaN2024-02-26_0142024-02-26 12:09:58
102272024_1030_474/2024-02-27_010/2024-02-27_010474NaNballsqueezingNaNNaNNaNNaNNaN2024-02-27_0102024-02-27 11:37:36
\n", "
" ], "text/plain": [ " current_name sub ses task \\\n", "0 02262024_1100_473/2024-02-26_010/2024-02-26_014 473 NaN ballsqueezing \n", "1 02272024_1030_474/2024-02-27_010/2024-02-27_010 474 NaN ballsqueezing \n", "\n", " run acq cond cond_match duration filename_org acq_time \n", "0 NaN NaN NaN NaN NaN 2024-02-26_014 2024-02-26 12:09:58 \n", "1 NaN NaN NaN NaN NaN 2024-02-27_010 2024-02-27 11:37:36 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mapping_df[\"filename_org\"] = mapping_df[\"current_name\"].apply(\n", " lambda x: os.path.basename(x))\n", "scan_df = bids.search_for_acq_time_in_scan_files(dataset_path)\n", "\n", "mapping_df = pd.merge(mapping_df, scan_df, on=\"filename_org\", how=\"left\")\n", "mapping_df[\"acq_time\"] = mapping_df.apply(\n", " bids.search_for_acq_time_in_snirf_files, axis=1, args=(dataset_path,)\n", ")\n", "\n", "mapping_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `acq_time` information is retrieved from the original dataset's *_scan.tsv files and integrated into the mapping table." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Looking for possible *_session.tsv files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to *_scan.tsv files, we search for *_session.tsv files in the dataset path to capture additional session-level metadata, such as acquisition times. Any relevant information from these files is added to the mapping table to ensure all session details are preserved." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.895164Z", "iopub.status.busy": "2025-11-11T09:49:58.894989Z", "iopub.status.idle": "2025-11-11T09:49:58.907546Z", "shell.execute_reply": "2025-11-11T09:49:58.906747Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
current_namesubsestaskrunacqcondcond_matchdurationfilename_orgacq_timeses_acq_time
002262024_1100_473/2024-02-26_010/2024-02-26_014473NaNballsqueezingNaNNaNNaNNaNNaN2024-02-26_0142024-02-26 12:09:58NaN
102272024_1030_474/2024-02-27_010/2024-02-27_010474NaNballsqueezingNaNNaNNaNNaNNaN2024-02-27_0102024-02-27 11:37:36NaN
\n", "
" ], "text/plain": [ " current_name sub ses task \\\n", "0 02262024_1100_473/2024-02-26_010/2024-02-26_014 473 NaN ballsqueezing \n", "1 02272024_1030_474/2024-02-27_010/2024-02-27_010 474 NaN ballsqueezing \n", "\n", " run acq cond cond_match duration filename_org acq_time \\\n", "0 NaN NaN NaN NaN NaN 2024-02-26_014 2024-02-26 12:09:58 \n", "1 NaN NaN NaN NaN NaN 2024-02-27_010 2024-02-27 11:37:36 \n", "\n", " ses_acq_time \n", "0 NaN \n", "1 NaN " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session_df = bids.search_for_sessions_acq_time(dataset_path)\n", "mapping_df = pd.merge(mapping_df, session_df, on=[\"sub\", \"ses\"], how=\"left\")\n", "\n", "mapping_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Converting the dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create BIDS Folder Structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of this section is to rename the SNIRF files according to the BIDS naming convention and place them in the appropriate directory under `destination_path`, following the BIDS folder structure.\n", "\n", "Steps:\n", "1. Generate new filenames: Create BIDS-compliant filenames for all SNIRF records.\n", "2. Determine file locations: Identify the appropriate locations for these files within the BIDS folder hierarchy.\n", "\n", "This process ensures that the dataset adheres to BIDS standards for organization and naming." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.909536Z", "iopub.status.busy": "2025-11-11T09:49:58.909288Z", "iopub.status.idle": "2025-11-11T09:49:58.923873Z", "shell.execute_reply": "2025-11-11T09:49:58.922904Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
current_namesubsestaskrunacqcondcond_matchdurationfilename_orgacq_timeses_acq_timebids_nameparent_path
002262024_1100_473/2024-02-26_010/2024-02-26_014473NaNballsqueezingNaNNaNNaNNaNNaN2024-02-26_0142024-02-26 12:09:58NaNsub-473_task-ballsqueezing_nirs.snirfsub-473/nirs
102272024_1030_474/2024-02-27_010/2024-02-27_010474NaNballsqueezingNaNNaNNaNNaNNaN2024-02-27_0102024-02-27 11:37:36NaNsub-474_task-ballsqueezing_nirs.snirfsub-474/nirs
\n", "
" ], "text/plain": [ " current_name sub ses task \\\n", "0 02262024_1100_473/2024-02-26_010/2024-02-26_014 473 NaN ballsqueezing \n", "1 02272024_1030_474/2024-02-27_010/2024-02-27_010 474 NaN ballsqueezing \n", "\n", " run acq cond cond_match duration filename_org acq_time \\\n", "0 NaN NaN NaN NaN NaN 2024-02-26_014 2024-02-26 12:09:58 \n", "1 NaN NaN NaN NaN NaN 2024-02-27_010 2024-02-27 11:37:36 \n", "\n", " ses_acq_time bids_name parent_path \n", "0 NaN sub-473_task-ballsqueezing_nirs.snirf sub-473/nirs \n", "1 NaN sub-474_task-ballsqueezing_nirs.snirf sub-474/nirs " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mapping_df[[\"bids_name\", \"parent_path\"]] = mapping_df.apply(\n", " bids.create_bids_standard_filenames, axis=1, result_type='expand')\n", "\n", "mapping_df.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To facilitate proper organization:\n", "\n", "- `parent_path`: Added to the mapping dataframe to define the location of each SNIRF file within `destination_path`.\n", "- `bids_name`: Specifies the new BIDS-compliant name for each file.\n", "In the following sections, we will rename all files to their corresponding `bids_name` and copy them to their designated parent_path." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.925606Z", "iopub.status.busy": "2025-11-11T09:49:58.925444Z", "iopub.status.idle": "2025-11-11T09:49:58.934301Z", "shell.execute_reply": "2025-11-11T09:49:58.933366Z" } }, "outputs": [], "source": [ "_ = mapping_df.apply(bids.copy_rename_snirf, axis=1, args=(dataset_path, destination_path))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create BIDS specific files (e.g., _coordsystem.json)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this step, we utilize the snirf2bids Python package to generate the necessary .tsv and .json files for the BIDS structure.\n", "\n", "For every record, the following files will be created:\n", "1. _coordsystem.json\n", "2. _optodes.json\n", "3. _optodes.tsv\n", "4. *_channels.tsv\n", "5. *_events.json\n", "6. *_events.tsv\n", "7. *_nirs.json\n", "\n", "These files are essential for ensuring the dataset adheres to BIDS standards." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:49:58.936136Z", "iopub.status.busy": "2025-11-11T09:49:58.935971Z", "iopub.status.idle": "2025-11-11T09:50:02.127370Z", "shell.execute_reply": "2025-11-11T09:50:02.126574Z" } }, "outputs": [], "source": [ "s2b.snirf2bids_recurse(destination_path)\n", "pattern = re.compile(r'.*_scans\\.tsv$|^participants\\.tsv$|^temp_participants\\.tsv$')\n", "files_to_delete = [file for file in destination_path.rglob('*') if file.is_file() and pattern.match(file.name)]\n", "for file in files_to_delete:\n", " file.unlink()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create _scan.tsv Files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we proceed to create scan files for all subjects and sessions. 
Previously, we searched the original dataset path for any provided scan information, which will now be incorporated into the BIDS structure." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.129455Z", "iopub.status.busy": "2025-11-11T09:50:02.129280Z", "iopub.status.idle": "2025-11-11T09:50:02.138773Z", "shell.execute_reply": "2025-11-11T09:50:02.138069Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: []\n", "Index: []" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scan_df = mapping_df[[\"sub\", \"ses\", \"bids_name\", \"acq_time\"]].copy()\n", "scan_df['ses'].fillna(\"Unknown\", inplace=True)\n", "scan_df = scan_df.groupby([\"sub\", \"ses\"])\n", "scan_df.apply(lambda group: bids.create_scan_files(group, destination_path))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create _session.tsv Files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next step is to create session files for all subjects. As with the scan files, we previously searched the original dataset path for any session information, which will now be used to create the corresponding BIDS session files." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.140654Z", "iopub.status.busy": "2025-11-11T09:50:02.140492Z", "iopub.status.idle": "2025-11-11T09:50:02.147272Z", "shell.execute_reply": "2025-11-11T09:50:02.146640Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: []\n", "Index: []" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session_df = mapping_df[[\"sub\", \"ses\", \"ses_acq_time\"]]\n", "session_df = session_df.groupby([\"sub\"])\n", "session_df.apply(lambda group: bids.create_session_files(group, destination_path))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create and Integrate participants.tsv and participants.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this step, we gather available participant information and incorporate it into the BIDS structure. \n", "\n", "If you want to use custom participant metadata, you should provide it at the beginning of the code, either as a participants.tsv file or as a CSV/Excel file.\n", "\n", "- If you provide a participants.tsv file but not a corresponding participants.json, you should fill out the participants.json manually to include descriptions for each field to comply with BIDS standards.\n", "\n", "- If you provide neither file, new participants.tsv and participants.json files will be automatically created with standard fields:\n", "\n", " - species\n", " - age\n", " - sex\n", " - handedness\n", "\n", "You can also pass your favourite/custom fields instead of these defaults when creating new files (only applies if no valid TSV is provided)." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.149160Z", "iopub.status.busy": "2025-11-11T09:50:02.149004Z", "iopub.status.idle": "2025-11-11T09:50:02.154286Z", "shell.execute_reply": "2025-11-11T09:50:02.153643Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No valid participants.tsv file found. 
Creating default files.\n" ] } ], "source": [ "saved_participants = bids.create_participants_files(bids_dir=destination_path, \n", " participants_tsv_path= participants_tsv_file, \n", " participants_json_path=participants_json_file, \n", " mapping_df=mapping_df,\n", " fields=[\"gender\", \"age\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create data description file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create the dataset_description.json file, we follow these steps:\n", "\n", "1. Search for an existing dataset_description.json in the dataset path and retain the provided information.\n", "2. If extra_meta_data_path is specified, add the additional metadata about the dataset.\n", "3. If neither dataset_description.json nor extra metadata is provided, use the basename of the dataset directory as the dataset name and set the BIDS version to '1.10.0'." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.156070Z", "iopub.status.busy": "2025-11-11T09:50:02.155909Z", "iopub.status.idle": "2025-11-11T09:50:02.159126Z", "shell.execute_reply": "2025-11-11T09:50:02.158418Z" } }, "outputs": [], "source": [ "bids.create_data_description(dataset_path, destination_path, extra_meta_data_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Check _coordsystem.json file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since an empty string is not allowed for the `NIRSCoordinateSystem` key in the *_coordsystem.json file, we will populate it with \"Other\" to ensure BIDS compliance." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.160635Z", "iopub.status.busy": "2025-11-11T09:50:02.160483Z", "iopub.status.idle": "2025-11-11T09:50:02.166897Z", "shell.execute_reply": "2025-11-11T09:50:02.166123Z" } }, "outputs": [], "source": [ "bids.check_coord_files(destination_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fix *_events.tsv order\n", "\n", "Sorting events files based on onset time" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.168497Z", "iopub.status.busy": "2025-11-11T09:50:02.168299Z", "iopub.status.idle": "2025-11-11T09:50:02.175789Z", "shell.execute_reply": "2025-11-11T09:50:02.175032Z" } }, "outputs": [], "source": [ "_ = mapping_df.apply(bids.sort_events, axis=1, args=(destination_path,))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Edit *_events.tsv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To allow editing of the `duration` or `trial_type` columns in the *_events.tsv files, the mapping CSV file must include the following extra columns:\n", "\n", "1. `duration`: Specifies the new duration for each SNIRF file that needs editing.\n", "2. cond and cond_match:\n", "\n", " - cond: A list of existing condition labels found in the SNIRF file (e.g., [1, 2]).\n", "\n", " - cond_match: A list of new labels you want to use in place of those conditions (e.g., [\"con\", \"inc\"]).\n", " \n", "These two columns will be combined into a dictionary to update the trial_type column in the events file. This allows for relabeling of condition names in a BIDS-compliant way." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.177329Z", "iopub.status.busy": "2025-11-11T09:50:02.177173Z", "iopub.status.idle": "2025-11-11T09:50:02.183750Z", "shell.execute_reply": "2025-11-11T09:50:02.182891Z" } }, "outputs": [], "source": [ "_ = mapping_df.apply(bids.edit_events, axis=1, args=(destination_path,))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating sourcedata directory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally there is this possiblity to keep your original data under sourcedata directory at your `destination_path`." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.185534Z", "iopub.status.busy": "2025-11-11T09:50:02.185329Z", "iopub.status.idle": "2025-11-11T09:50:02.199790Z", "shell.execute_reply": "2025-11-11T09:50:02.198950Z" } }, "outputs": [], "source": [ "bids.save_source(dataset_path, destination_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inspecting the results" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.201602Z", "iopub.status.busy": "2025-11-11T09:50:02.201424Z", "iopub.status.idle": "2025-11-11T09:50:02.206890Z", "shell.execute_reply": "2025-11-11T09:50:02.206162Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tmpahpn2f41/\n", "├─dataset_description.json\n", "├─sub-473/\n", "│ ├─sub-473_scans.tsv\n", "│ └─nirs/\n", "│ ├─sub-473_coordsystem.json\n", "│ ├─sub-473_optodes.json\n", "│ ├─sub-473_task-ballsqueezing_nirs.json\n", "│ ├─sub-473_optodes.tsv\n", "│ ├─sub-473_task-ballsqueezing_events.json\n", "│ ├─sub-473_task-ballsqueezing_channels.tsv\n", "│ ├─sub-473_task-ballsqueezing_events.tsv\n", "│ └─sub-473_task-ballsqueezing_nirs.snirf\n", "├─sub-474/\n", "│ ├─sub-474_scans.tsv\n", "│ └─nirs/\n", "│ 
├─sub-474_coordsystem.json\n", "│ ├─sub-474_task-ballsqueezing_events.tsv\n", "│ ├─sub-474_task-ballsqueezing_channels.tsv\n", "│ ├─sub-474_optodes.json\n", "│ ├─sub-474_task-ballsqueezing_events.json\n", "│ ├─sub-474_task-ballsqueezing_nirs.json\n", "│ ├─sub-474_task-ballsqueezing_nirs.snirf\n", "│ └─sub-474_optodes.tsv\n", "├─participants.json\n", "├─participants.tsv\n", "└─sourcedata/\n", " ├─02262024_1100_473/\n", " │ └─2024-02-26_010/\n", " │ ├─2024-02-26_014_config.json\n", " │ ├─2024-02-26_014.snirf\n", " │ ├─2024-02-26_014_lsl.tri\n", " │ ├─2024-02-26_014_config.hdr\n", " │ ├─2024-02-26_014_calibration.json\n", " │ ├─2024-02-26_014_probeInfo.mat\n", " │ ├─2024-02-26_014_description.json\n", " │ ├─2024-02-26_014.wl2\n", " │ ├─2024-02-26_014.wl1\n", " │ └─digpts.txt\n", " ├─snirf2BIDS_mapping.csv\n", " ├─snirf2BIDS_mapping_edited.csv\n", " ├─02272024_1030_474/\n", " │ └─2024-02-27_010/\n", " │ ├─2024-02-27_010_config.json\n", " │ ├─2024-02-27_010_config.hdr\n", " │ ├─2024-02-27_010_calibration.json\n", " │ ├─2024-02-27_010.wl1\n", " │ ├─2024-02-27_010_probeInfo.mat\n", " │ ├─2024-02-27_010.snirf\n", " │ ├─2024-02-27_010_description.json\n", " │ ├─2024-02-27_010.wl2\n", " │ ├─2024-02-27_010_lsl.tri\n", " │ └─digpts.txt\n", " └─readme.txt\n" ] } ], "source": [ "seedir(destination_path)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.208363Z", "iopub.status.busy": "2025-11-11T09:50:02.208215Z", "iopub.status.idle": "2025-11-11T09:50:02.215391Z", "shell.execute_reply": "2025-11-11T09:50:02.214765Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
participant_idgenderage
0sub-473NaNNaN
1sub-474NaNNaN
\n", "
" ], "text/plain": [ " participant_id gender age\n", "0 sub-473 NaN NaN\n", "1 sub-474 NaN NaN" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(pd.read_table(destination_path / \"participants.tsv\"))" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.217120Z", "iopub.status.busy": "2025-11-11T09:50:02.216964Z", "iopub.status.idle": "2025-11-11T09:50:02.254239Z", "shell.execute_reply": "2025-11-11T09:50:02.253438Z" } }, "outputs": [ { "data": { "text/html": [ "
{\n",
       "  \"gender\": null,\n",
       "  \"age\": null\n",
       "}\n",
       "
\n" ], "text/plain": [ "\u001b[1m{\u001b[0m\n", " \u001b[1;34m\"gender\"\u001b[0m: \u001b[3;35mnull\u001b[0m,\n", " \u001b[1;34m\"age\"\u001b[0m: \u001b[3;35mnull\u001b[0m\n", "\u001b[1m}\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "with open(destination_path / \"participants.json\") as fin:\n", " print_json(fin.read())" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2025-11-11T09:50:02.255825Z", "iopub.status.busy": "2025-11-11T09:50:02.255662Z", "iopub.status.idle": "2025-11-11T09:50:02.260483Z", "shell.execute_reply": "2025-11-11T09:50:02.259725Z" } }, "outputs": [ { "data": { "text/html": [ "
{\n",
       "  \"Name\": \"snirf2bids_example_dataset\",\n",
       "  \"BIDSVersion\": \"1.10.0\",\n",
       "  \"License\": \"CC0\",\n",
       "  \"DatasetType\": \"raw\",\n",
       "  \"Authors\": [\n",
       "    \"Enter author names here\"\n",
       "  ],\n",
       "  \"Acknowledgements\": \"Enter acknowledgements here (e.g., funding sources, institutions).\",\n",
       "  \"HowToAcknowledge\": \"Provide details on how to cite or acknowledge this dataset.\",\n",
       "  \"DatasetDOI\": \"Enter DOI here if available.\",\n",
       "  \"Funding\": [\n",
       "    \"Enter funding details here, if applicable.\"\n",
       "  ],\n",
       "  \"EthicsApprovals\": [\n",
       "    \"Enter ethics approval details here, if applicable.\"\n",
       "  ],\n",
       "  \"ReferencesAndLinks\": [\n",
       "    \"Enter references or related links here, if applicable.\"\n",
       "  ]\n",
       "}\n",
       "
\n" ], "text/plain": [ "\u001b[1m{\u001b[0m\n", " \u001b[1;34m\"Name\"\u001b[0m: \u001b[32m\"snirf2bids_example_dataset\"\u001b[0m,\n", " \u001b[1;34m\"BIDSVersion\"\u001b[0m: \u001b[32m\"1.10.0\"\u001b[0m,\n", " \u001b[1;34m\"License\"\u001b[0m: \u001b[32m\"CC0\"\u001b[0m,\n", " \u001b[1;34m\"DatasetType\"\u001b[0m: \u001b[32m\"raw\"\u001b[0m,\n", " \u001b[1;34m\"Authors\"\u001b[0m: \u001b[1m[\u001b[0m\n", " \u001b[32m\"Enter author names here\"\u001b[0m\n", " \u001b[1m]\u001b[0m,\n", " \u001b[1;34m\"Acknowledgements\"\u001b[0m: \u001b[32m\"Enter acknowledgements here (e.g., funding sources, institutions).\"\u001b[0m,\n", " \u001b[1;34m\"HowToAcknowledge\"\u001b[0m: \u001b[32m\"Provide details on how to cite or acknowledge this dataset.\"\u001b[0m,\n", " \u001b[1;34m\"DatasetDOI\"\u001b[0m: \u001b[32m\"Enter DOI here if available.\"\u001b[0m,\n", " \u001b[1;34m\"Funding\"\u001b[0m: \u001b[1m[\u001b[0m\n", " \u001b[32m\"Enter funding details here, if applicable.\"\u001b[0m\n", " \u001b[1m]\u001b[0m,\n", " \u001b[1;34m\"EthicsApprovals\"\u001b[0m: \u001b[1m[\u001b[0m\n", " \u001b[32m\"Enter ethics approval details here, if applicable.\"\u001b[0m\n", " \u001b[1m]\u001b[0m,\n", " \u001b[1;34m\"ReferencesAndLinks\"\u001b[0m: \u001b[1m[\u001b[0m\n", " \u001b[32m\"Enter references or related links here, if applicable.\"\u001b[0m\n", " \u001b[1m]\u001b[0m\n", "\u001b[1m}\u001b[0m\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", "with open(destination_path / \"dataset_description.json\") as fin:\n", " print_json(fin.read())" ] } ], "metadata": { "kernelspec": { "display_name": "cedalion_250922", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.14" } }, "nbformat": 4, "nbformat_minor": 2 }