chembee.preparation package

Submodules

chembee.preparation.processing module

chembee.preparation.processing.calculate_lipinski_desc(data_set: DataFrame, mols: Series) DataFrame[source]

The calculate_lipinski_desc function calculates the molecular descriptors for each molecule in a data set. It takes a pandas DataFrame (data_set) containing the SMILES string of each molecule and a pandas Series (mols) containing the corresponding RDKit Mol objects.

The function returns the original DataFrame with one additional column of floats per descriptor.

Parameters
  • data_set:pd.DataFrame – The DataFrame to which the calculated descriptor columns are added.

  • mols:pd.Series – The RDKit Mol objects for which the descriptors are calculated.

Returns

The input DataFrame, extended with the descriptor columns for each molecule in the mols Series.

Doc-author

Trelent
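
A minimal sketch of the idea behind calculate_lipinski_desc, not chembee's actual implementation. It assumes the descriptors are the usual rule-of-five quantities (molecular weight, LogP, hydrogen-bond donors and acceptors), which this page does not confirm. The helper name lipinski_sketch and the column names are hypothetical.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski


def lipinski_sketch(data_set: pd.DataFrame, mols: pd.Series) -> pd.DataFrame:
    """Illustrative stand-in; column names are assumptions for this example."""
    # Work on a copy so the caller's DataFrame is left untouched.
    result = data_set.copy()
    result["MW"] = [float(Descriptors.MolWt(m)) for m in mols]
    result["LogP"] = [float(Descriptors.MolLogP(m)) for m in mols]
    result["NumHDonors"] = [float(Lipinski.NumHDonors(m)) for m in mols]
    result["NumHAcceptors"] = [float(Lipinski.NumHAcceptors(m)) for m in mols]
    return result


smiles = ["CCO", "c1ccccc1O"]  # ethanol and phenol
data_set = pd.DataFrame({"smiles": smiles})
mols = pd.Series([Chem.MolFromSmiles(s) for s in smiles])
described = lipinski_sketch(data_set, mols)
```

The sketch returns a copy rather than modifying data_set in place; whether the real function mutates its input is not stated here.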

chembee.preparation.processing.calculate_mordred_descriptors(mols: list) DataFrame[source]

The calculate_mordred_descriptors function calculates Mordred descriptors for a set of molecules. The function takes as input a list of RDKit molecule objects and returns the calculated descriptors in a pandas dataframe.

Parameters

mols – A list of RDKit molecule objects for which descriptors are calculated.

Returns

A pandas dataframe containing the calculated descriptors.

Doc-author

Julian M. Kleber

chembee.preparation.processing.convert_mol_to_inchi(mols)[source]

The convert_mol_to_inchi function converts a molecule into its InChI string and InChIKey.

Parameters

mols – The molecule to convert.

Returns

A tuple of the InChI and the InChIKey.

Doc-author

Julian M. Kleber
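
The conversion described for convert_mol_to_inchi can be sketched with RDKit's InChI functions. This is an illustration for a single molecule and assumes an RDKit build with InChI support. The helper name mol_to_inchi_sketch is hypothetical, and the chembee function may handle its input differently.

```python
from rdkit import Chem


def mol_to_inchi_sketch(mol):
    """Return (InChI, InChIKey) for one RDKit molecule."""
    inchi = Chem.MolToInchi(mol)
    # The InChIKey is a fixed-length hash derived from the InChI string.
    inchikey = Chem.InchiToInchiKey(inchi)
    return inchi, inchikey


ethanol = Chem.MolFromSmiles("CCO")
inchi, inchikey = mol_to_inchi_sketch(ethanol)
```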

chembee.preparation.processing.convert_mols_to_morgan_fp(mols, radius=3, n_bits=2048, return_bit=False)[source]

The convert_mols_to_morgan_fp function converts a list of molecules to their Morgan fingerprints.

Parameters
  • mols – Used to specify the molecules that are to be converted into Morgan fingerprints.

  • radius=3 – The radius, in bonds, of the atom neighbourhood that each fingerprint feature encodes.

  • n_bits=2048 – The length of the Morgan fingerprint in bits.

Returns

A list of fingerprints.

Doc-author

Julian M. Kleber
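
To make the roles of radius and n_bits concrete, here is a hedged sketch using RDKit's GetMorganFingerprintAsBitVect. It is not the chembee implementation. The helper name morgan_fp_sketch is hypothetical, and the return_bit option is left out because its behaviour is not described on this page.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def morgan_fp_sketch(mols, radius=3, n_bits=2048):
    """Return one Morgan bit-vector fingerprint per molecule."""
    return [
        # radius sets how many bonds away from each atom are encoded;
        # nBits sets the length of the folded bit vector.
        AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
        for mol in mols
    ]


mols = [Chem.MolFromSmiles(s) for s in ("CCO", "c1ccccc1O")]
fps = morgan_fp_sketch(mols)
short_fps = morgan_fp_sketch(mols, n_bits=1024)
```

Changing n_bits changes only the vector length. Changing radius changes which substructures set bits.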

chembee.preparation.processing.convert_mols_to_rdk_fp(mols, return_bit=False)[source]

The convert_mols_to_rdk_fp function converts a list of RDKit molecules into a list of RDKit fingerprints.

Parameters

mols – Used to pass the molecules that are to be converted into fingerprints.

Returns

A list of RDKit fingerprints.

Doc-author

Julian M. Kleber

chembee.preparation.processing.get_mols_from_supplier(indices, supplier)[source]

The get_mols_from_supplier function takes a list of indices and a supplier object, and returns the molecules corresponding to those indices.

Parameters
  • indices – Used to specify which molecules to retrieve from the supplier.

  • supplier – Used to specify the supplier of the molecules.

Returns

A list of molecules.

Doc-author

Julian M. Kleber
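
The index-based selection that get_mols_from_supplier describes can be sketched as below. It assumes the supplier supports integer indexing, as RDKit's SDMolSupplier does. The helper name get_mols_sketch is hypothetical, and a plain list of strings stands in for a real supplier so the example runs without any chemistry toolkit.

```python
def get_mols_sketch(indices, supplier):
    """Return the supplier entries at the given positions, in the order requested."""
    return [supplier[i] for i in indices]


# Stand-in for a supplier: any object that supports supplier[i].
supplier = ["mol_a", "mol_b", "mol_c", "mol_d"]
selected = get_mols_sketch([0, 2], supplier)
```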

chembee.preparation.processing.load_data(file_path: str)[source]

The load_data function loads the data from a file and returns it as a Pandas DataFrame.

Parameters

file_path:str – Path to the file to load.

Returns

A DataFrame with the columns specified by the SDF data entries.

Doc-author

Julian M. Kleber

Module contents