protdata.io.read_maxquant
- protdata.io.read_maxquant(file, intensity_column_prefixes=['LFQ intensity ', 'Intensity ', 'MS/MS count '], index_column='Protein IDs', filter_columns=['Only identified by site', 'Reverse', 'Potential contaminant'], sep='\\t')
 Load MaxQuant proteinGroups.txt into an AnnData object.
- Parameters:
 - file (Union[str, DataFrame]): Path to the MaxQuant proteinGroups.txt file, or a pandas DataFrame containing the data.
 - intensity_column_prefixes (Union[List[str], str], default: ['LFQ intensity ', 'Intensity ', 'MS/MS count ']): Prefix(es) of the intensity columns to extract. The first prefix is used for the main matrix (X); the others are stored as layers if present.
 - index_column (str, default: 'Protein IDs'): Name of the column to use as the protein index.
 - filter_columns (list[str], default: ['Only identified by site', 'Reverse', 'Potential contaminant']): Columns used to filter out contaminants and other unwanted entries.
 - sep (str, default: '\\t'): Field separator, used when reading from a file.
- Return type:
 anndata.AnnData
- Returns:
 anndata.AnnData object with:
 - X: intensity matrix (samples x proteins)
 - var: protein metadata (indexed by protein IDs)
 - obs: sample metadata (indexed by sample names)
 - layers: additional intensity matrices, if multiple intensity column prefixes are provided
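
The filtering and the samples x proteins orientation described above can be illustrated with a small, standard-library-only sketch. It is not protdata's implementation: the table contents are invented, and it assumes MaxQuant's usual convention of marking flagged rows with "+" in the filter columns.

```python
import csv
import io

# A tiny, made-up stand-in for proteinGroups.txt (tab-separated).
RAW = (
    "Protein IDs\tLFQ intensity A\tLFQ intensity B\tReverse\tPotential contaminant\n"
    "P1\t10\t20\t\t\n"
    "REV__P2\t5\t6\t+\t\n"
    "P3\t30\t40\t\t\n"
    "CON__P4\t7\t8\t\t+\n"
)

FILTER_COLUMNS = ["Only identified by site", "Reverse", "Potential contaminant"]
PREFIX = "LFQ intensity "

rows = list(csv.DictReader(io.StringIO(RAW), delimiter="\t"))

# Assumption: a "+" in any filter column marks a row to drop.
# Filter columns missing from the table are simply ignored.
kept = [r for r in rows if not any(r.get(c, "") == "+" for c in FILTER_COLUMNS)]

# Sample names are whatever follows the prefix in the column header.
sample_cols = [c for c in rows[0] if c.startswith(PREFIX)]
samples = [c[len(PREFIX):] for c in sample_cols]
proteins = [r["Protein IDs"] for r in kept]

# The file has one row per protein; transpose so rows are samples
# and columns are proteins, matching the documented layout of X.
X = [[float(r[c]) for r in kept] for c in sample_cols]

print(samples)   # row labels (obs)
print(proteins)  # column labels (var)
print(X)
```

In this sketch `samples` plays the role of the obs index and `proteins` the role of the var index.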
Notes
The first intensity column prefix is used for the main matrix (X), others are stored as layers if present.
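
As a rough illustration of prefix-based column selection, the sketch below groups column names by prefix. The helper `split_by_prefix` is hypothetical and is not part of protdata; the column names are examples only. It also shows why the trailing space in each prefix matters: the summed `Intensity` column is not mistaken for a per-sample column.

```python
def split_by_prefix(columns, prefixes):
    """Map each prefix to the sample names found under it (hypothetical helper)."""
    if isinstance(prefixes, str):
        prefixes = [prefixes]  # accept a single prefix as well as a list
    found = {}
    for prefix in prefixes:
        names = [c[len(prefix):] for c in columns if c.startswith(prefix)]
        if names:  # prefixes with no matching columns are skipped
            found[prefix] = names
    return found


columns = [
    "Protein IDs",
    "LFQ intensity A",
    "LFQ intensity B",
    "Intensity",  # summed column: no trailing space, so "Intensity " does not match
    "Intensity A",
    "Intensity B",
    "Reverse",
]

groups = split_by_prefix(columns, ["LFQ intensity ", "Intensity ", "MS/MS count "])

# Assumes the first prefix is present in the table:
# it would supply X, and the remaining ones would become layers.
main_prefix, *layer_prefixes = groups

print(groups)
print(main_prefix, layer_prefixes)
```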
Forward slashes (/) are not allowed in HDF5 keys, so they are replaced with underscores (_).
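
HDF5 uses the forward slash as its path separator, which is why a name such as "MS/MS count" cannot be used as a key unchanged. The replacement amounts to something like the following sketch. The helper name `hdf5_safe_key` is hypothetical, and the exact layer names protdata produces (for example, whether whitespace is trimmed) are not stated here.

```python
def hdf5_safe_key(name):
    """Replace forward slashes so the name can be used as an HDF5 key (hypothetical helper)."""
    return name.replace("/", "_")


print(hdf5_safe_key("MS/MS count"))
print(hdf5_safe_key("LFQ intensity"))  # names without slashes are unchanged
```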