df_util

Utilities for assembly and conversion of HED strings to different forms.

Functions

convert_to_form(df, hed_schema, tag_form[, ...])

Convert all tags in the underlying dataframe to the specified form (in place).

expand_defs(df, hed_schema, def_dict[, columns])

Expand (in place) any def tags found in the dataframe.

get_assembled(tabular_file, hed_schema[, ...])

Create a list of assembled HedString objects (or a list of lists of them) with the same length as the tabular file input.

process_def_expands(hed_strings, hed_schema)

Gather def-expand tags in the strings and compare them with known definitions to find any differences.

shrink_defs(df, hed_schema[, columns])

Shrink (in place) any def-expand tags found in the specified columns in the dataframe.

sort_dataframe_by_onsets(df)

Sort the dataframe by onsets, if it has an onset column.

convert_to_form(df, hed_schema, tag_form, columns=None)

Convert all tags in the underlying dataframe to the specified form (in place).

Parameters:
  • df (pd.DataFrame or pd.Series) – The dataframe or series to modify.

  • hed_schema (HedSchema) – The schema to use to convert tags.

  • tag_form (str) – HedTag property to convert tags to.

  • columns (list or None) – The columns to modify on the dataframe.

expand_defs(df, hed_schema, def_dict, columns=None)

Expand (in place) any def tags found in the dataframe.

Parameters:
  • df (pd.DataFrame or pd.Series) – The dataframe or series to modify.

  • hed_schema (HedSchema or None) – The schema to use to identify defs.

  • def_dict (DefinitionDict) – The definitions to expand.

  • columns (list or None) – The columns to modify on the dataframe.

get_assembled(tabular_file, hed_schema, extra_def_dicts=None, defs_expanded=True)

Create a list of assembled HedString objects (or a list of lists of them) with the same length as the tabular file input.

Parameters:
  • tabular_file (TabularInput) – Represents the tabular input file.

  • hed_schema (HedSchema) – The schema to use. If a str is given, it is treated as a schema version to load unless it has a valid file extension.

  • extra_def_dicts (list of DefinitionDict, optional) – Any extra DefinitionDict objects to use when parsing the HED tags.

  • defs_expanded (bool) – Expands definitions if True (the default), otherwise shrinks them.

Returns:

  • hed_strings (list of HedString) – A list of HedStrings or a list of lists of HedStrings.

  • def_dict (DefinitionDict) – The definitions from this Sidecar.

Return type:

tuple

process_def_expands(hed_strings, hed_schema, known_defs=None, ambiguous_defs=None)

Gather def-expand tags in the strings and compare them with known definitions to find any differences.

Parameters:
  • hed_strings (list or pd.Series) – A list of HED strings to process.

  • hed_schema (HedSchema) – The schema to use.

  • known_defs (DefinitionDict or list or str or None) – A DefinitionDict or anything its constructor takes. These are the definitions known going in; they must match exactly.

  • ambiguous_defs (dict) – A dictionary containing ambiguous definitions. The format is not final (TBD); currently each key is a def name and each value is a list of lists of HED tags.

Returns:

A tuple containing the DefinitionDict, ambiguous definitions, and errors.

Return type:

tuple
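The gather-and-compare idea can be illustrated without the library. The following is a conceptual sketch only, not the implementation of process_def_expands: the helper names, the plain-dict representation of definitions, and the regular expression (which handles only un-nested contents) are all assumptions made for illustration.

```python
import re

# Matches simple groups of the form "(Def-expand/Name, (contents))".
# Assumption: contents contain no nested parentheses.
_DEF_EXPAND = re.compile(r"\(Def-expand/([^,()]+),\s*(\([^()]*\))\)")


def gather_def_expands(hed_strings):
    """Yield (name, contents) for each simple def-expand group found."""
    for text in hed_strings:
        for name, contents in _DEF_EXPAND.findall(text):
            yield name, contents


def compare_with_known(hed_strings, known_defs):
    """Return (new_defs, conflicts) relative to known_defs.

    known_defs is a hypothetical plain dict: def name -> contents text.
    """
    new_defs, conflicts = {}, {}
    for name, contents in gather_def_expands(hed_strings):
        expected = known_defs.get(name, new_defs.get(name))
        if expected is None:
            # First time this name is seen: record it as a new definition.
            new_defs[name] = contents
        elif expected != contents:
            # Same name, different contents: record the difference.
            conflicts.setdefault(name, []).append(contents)
    return new_defs, conflicts


strings = [
    "Red, (Def-expand/MyDef, (Blue, Square))",
    "(Def-expand/MyDef, (Green, Square))",
    "(Def-expand/Other, (Circle))",
]
new_defs, conflicts = compare_with_known(strings, {"MyDef": "(Blue, Square)"})
```

Here the second string disagrees with the known contents of MyDef and is reported as a conflict, while Other is collected as a newly seen definition.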

shrink_defs(df, hed_schema, columns=None)

Shrink (in place) any def-expand tags found in the specified columns in the dataframe.

Parameters:
  • df (pd.DataFrame or pd.Series) – The dataframe or series to modify.

  • hed_schema (HedSchema or None) – The schema to use to identify defs.

  • columns (list or None) – The columns to modify on the dataframe.
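To make "shrinking" concrete: a def-expand group such as (Def-expand/MyDef, (Blue, Square)) collapses to the short form Def/MyDef. The following is a minimal string-level sketch of that idea, not the library's implementation. The helper name shrink_def_expands_sketch is hypothetical, and the sketch matches the tag name case-sensitively for simplicity.

```python
def shrink_def_expands_sketch(hed_string: str) -> str:
    """Replace each '(Def-expand/Name, ...)' group with 'Def/Name'."""
    marker = "(Def-expand/"
    out = []
    i = 0
    while i < len(hed_string):
        if hed_string.startswith(marker, i):
            # Walk forward to the parenthesis that closes this group.
            depth = 0
            j = i
            while j < len(hed_string):
                if hed_string[j] == "(":
                    depth += 1
                elif hed_string[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            group = hed_string[i + 1:j]  # group text without outer parentheses
            # The first comma-separated item is the Def-expand tag itself.
            name = group.split(",", 1)[0].strip()[len("Def-expand/"):]
            out.append("Def/" + name)
            i = j + 1
        else:
            out.append(hed_string[i])
            i += 1
    return "".join(out)


shrunk = shrink_def_expands_sketch("Red, (Def-expand/MyDef, (Blue, Square)), Green")
```

The real function operates on dataframe columns and uses the schema to identify defs; this sketch only shows the before and after shape of a single string.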

sort_dataframe_by_onsets(df)

Sort the dataframe by onsets, if it has an onset column.

Parameters:

df (pd.DataFrame) – Dataframe to sort.

Returns:

The sorted dataframe, or the original dataframe if it didn’t have an onset column.
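The documented behavior (sort when an onset column exists, otherwise return the input unchanged) can be sketched with pandas alone. This is an illustration under assumptions, not the library's code: the function name sort_by_onsets_sketch is hypothetical, and converting onsets to numbers before sorting is a choice of this sketch so that string values such as "10.5" and "2.0" order numerically.

```python
import pandas as pd


def sort_by_onsets_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Return df sorted by its 'onset' column, or df itself if there is none."""
    if "onset" not in df.columns:
        return df
    # Sort on numeric values while leaving the stored onset values untouched.
    key = pd.to_numeric(df["onset"], errors="coerce")
    order = key.to_numpy().argsort(kind="stable")
    return df.iloc[order]


events = pd.DataFrame(
    {"onset": ["10.5", "2.0", "7.25"], "HED": ["Red", "Blue", "Green"]}
)
sorted_events = sort_by_onsets_sketch(events)
no_onset = pd.DataFrame({"HED": ["Red", "Blue"]})
```

A plain string sort would leave "10.5" ahead of "2.0", which is why the sketch sorts on a numeric key.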