read_matrix

caf.toolkit.io.read_matrix(path, format_=None, find_similar=False)

Read a matrix CSV in either the square or long format.

Sorts the index and column names and makes sure they’re the same. Any NaNs created when reindexing are not infilled.
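The alignment described above can be pictured with a small pandas sketch. It is not the library's implementation: `align_square` is a hypothetical helper, and taking the union of the row and column labels is an assumption about what "makes sure they’re the same" means.

```python
import pandas as pd


def align_square(matrix: pd.DataFrame) -> pd.DataFrame:
    """Give a matrix the same sorted labels on both axes (hypothetical helper)."""
    # Assumption: both axes should carry the sorted union of all labels.
    zones = sorted(set(matrix.index) | set(matrix.columns))
    # reindex adds missing rows and columns as NaN; they are deliberately not filled.
    return matrix.reindex(index=zones, columns=zones)


# Zone 3 appears only as a column and zone 2 only as a row.
raw = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=[2, 1],
    columns=[3, 1],
)
square = align_square(raw)
```

Here `square` has the labels 1, 2 and 3 on both axes, and the cells that did not exist in `raw` (all of row 3 and all of column 2) are NaN rather than zero.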

Parameters:
  • path (os.PathLike) – Path to CSV file

  • format_ (str | None, optional) – Expected format of the matrix, either ‘square’ or ‘long’. If not given, the function attempts to determine the format by reading the top few lines of the file.

  • find_similar (bool, default False) – If True and the file at path cannot be found, files with the same name but a different extension are looked for and read in instead. Extensions checked: ‘.csv’, ‘.pbz2’.
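The find_similar fallback can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: `resolve_similar` and `CANDIDATE_SUFFIXES` are hypothetical names, and the order in which the extensions are tried is a guess.

```python
import tempfile
from pathlib import Path

# Extensions named in the parameter description; the search order is an assumption.
CANDIDATE_SUFFIXES = (".csv", ".pbz2")


def resolve_similar(path: Path) -> Path:
    """Return path if it exists, else a sibling with another known suffix (hypothetical)."""
    if path.is_file():
        return path
    for suffix in CANDIDATE_SUFFIXES:
        candidate = path.with_suffix(suffix)
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no file found for {path} with suffixes {CANDIDATE_SUFFIXES}")


with tempfile.TemporaryDirectory() as tmp:
    stored = Path(tmp) / "demand.pbz2"
    stored.write_bytes(b"")  # placeholder file, its contents are irrelevant here
    # The requested .csv does not exist, so the .pbz2 sibling is picked up instead.
    found_name = resolve_similar(Path(tmp) / "demand.csv").name
```

With only `demand.pbz2` on disk, a request for `demand.csv` resolves to the `.pbz2` file; with find_similar left at False, the missing file would simply be an error.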

Returns:

Matrix in square format with sorted columns and indices

Return type:

pd.DataFrame

Raises:

ValueError – If the format cannot be determined by reading the file or an invalid format is given.
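To illustrate how a format could be inferred from the first few lines, and when that inference has to give up with a ValueError, here is a stand-alone sketch. The heuristic is an assumption made for illustration (square if the row labels also appear as column headers, long if there are exactly three columns); `guess_format` is a hypothetical function and is not how caf.toolkit decides.

```python
import csv
import io
from itertools import islice

SQUARE_CSV = "zone,1,2\n1,10,20\n2,30,40\n"
LONG_CSV = "origin,destination,trips\n1,1,10\n1,2,20\n2,1,30\n"


def guess_format(text: str, n_rows: int = 5) -> str:
    """Guess 'square' or 'long' from the header and the first few rows (hypothetical)."""
    rows = list(islice(csv.reader(io.StringIO(text)), n_rows + 1))
    if len(rows) < 2:
        raise ValueError("not enough lines to determine the matrix format")
    header, body = rows[0], rows[1:]
    if any(len(row) != len(header) for row in body):
        raise ValueError("rows have inconsistent lengths")
    # Assumed rule: in a square matrix the row labels reappear as column headers.
    if {row[0] for row in body} <= set(header[1:]):
        return "square"
    # Assumed rule: a long matrix has exactly three columns (row, column, value).
    if len(header) == 3:
        return "long"
    raise ValueError("matrix format could not be determined")


square_guess = guess_format(SQUARE_CSV)
long_guess = guess_format(LONG_CSV)
```

Passing format_ explicitly avoids relying on any such guess, which is the safer choice when a file's layout is already known.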