chembee.preparation package
Submodules
chembee.preparation.processing module
- chembee.preparation.processing.calculate_lipinski_desc(data_set: DataFrame, mols: Series) DataFrame [source]
The calculate_lipinski_desc function calculates molecular descriptors for each molecule in a data set. It takes two arguments: data_set, a pandas DataFrame containing the SMILES string of each molecule, and mols, a pandas Series containing the RDKit Mol object of each molecule.
The function returns the original DataFrame with one additional column per descriptor; the added columns contain floats.
- Parameters
data_set:pd.DataFrame – The DataFrame to which the calculated descriptor columns are added.
mols:pd.Series – The series of RDKit molecules for which descriptors are calculated.
- Returns
A dataframe with the descriptors for each molecule in the mols series.
- Doc-author
Trelent
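The DataFrame-in, DataFrame-out pattern described for calculate_lipinski_desc can be sketched as follows. This is a minimal illustration, not chembee's implementation: the helper name add_lipinski_columns, the column names, and the choice of four descriptors are assumptions made for the example; only the RDKit calls are real.

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski


def add_lipinski_columns(data_set: pd.DataFrame, mols: pd.Series) -> pd.DataFrame:
    """Hypothetical stand-in: append a few Lipinski-style descriptor columns."""
    result = data_set.copy()  # leave the caller's DataFrame untouched
    # Column names below are assumptions, not chembee's actual column names.
    result["MolWt"] = [Descriptors.MolWt(m) for m in mols]
    result["LogP"] = [Descriptors.MolLogP(m) for m in mols]
    result["NumHDonors"] = [Lipinski.NumHDonors(m) for m in mols]
    result["NumHAcceptors"] = [Lipinski.NumHAcceptors(m) for m in mols]
    return result


# Build the two inputs: a DataFrame of SMILES and a Series of RDKit Mol objects.
data_set = pd.DataFrame({"smiles": ["CCO", "c1ccccc1"]})
mols = data_set["smiles"].apply(Chem.MolFromSmiles)
described = add_lipinski_columns(data_set, mols)
```

The sketch copies the input before adding columns; whether chembee modifies data_set in place is not stated in this reference.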
- chembee.preparation.processing.calculate_mordred_descriptors(mols: list) DataFrame [source]
The calculate_mordred_descriptors function calculates Mordred descriptors for a set of molecules. The function takes as input a list of RDKit molecule objects and returns the calculated descriptors in a pandas dataframe.
- Parameters
mols – Used to specify the molecules for which descriptors are to be calculated.
- Returns
A pandas dataframe containing the calculated descriptors.
- Doc-author
Julian M. Kleber
- chembee.preparation.processing.convert_mol_to_inchi(mols)[source]
The convert_mol_to_inchi function converts a molecule into its InChI string and InChIKey.
- Parameters
mols – Used to pass the molecule to the function.
- Returns
A tuple of the InChI and InChIKey.
- Doc-author
Julian M. Kleber
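For orientation, the kind of (InChI, InChIKey) pair described above can be produced directly with RDKit. This is a sketch under the assumption that the molecule is an RDKit Mol and that RDKit was built with InChI support; it does not call chembee, and the variable names are illustrative.

```python
from rdkit import Chem

# Ethanol as a small test molecule.
mol = Chem.MolFromSmiles("CCO")

# RDKit exposes both conversions on the Chem module.
inchi = Chem.MolToInchi(mol)
inchikey = Chem.MolToInchiKey(mol)

# Packaged as a tuple, mirroring the documented return shape.
pair = (inchi, inchikey)
```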
- chembee.preparation.processing.convert_mols_to_morgan_fp(mols, radius=3, n_bits=2048, return_bit=False)[source]
The convert_mols_to_morgan_fp function converts a list of molecules to their Morgan fingerprints.
- Parameters
mols – Used to specify the molecules that are to be converted into Morgan fingerprints.
radius=3 – Used to set the radius of the atom neighborhood that the Morgan algorithm considers around each atom.
n_bits=2048 – Used to determine the number of bits in the Morgan fingerprint.
- Returns
A list of fingerprints.
- Doc-author
Julian M. Kleber
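To make the roles of radius and n_bits concrete, here is a minimal sketch that computes Morgan fingerprints with RDKit's fingerprint generator API. It assumes a reasonably recent RDKit release and is not necessarily how chembee computes them internally; the return_bit option is not covered because its behavior is not described in this reference.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

mols = [Chem.MolFromSmiles(s) for s in ("CCO", "c1ccccc1")]

# radius controls how far the atom environments extend;
# fpSize is the length of the resulting bit vector (n_bits above).
generator = rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=2048)

# One fixed-length bit vector per molecule.
fingerprints = [generator.GetFingerprint(m) for m in mols]
```

Increasing the radius captures larger substructures per atom, while the bit count only changes how those substructures are folded into the vector.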
- chembee.preparation.processing.convert_mols_to_rdk_fp(mols, return_bit=False)[source]
The convert_mols_to_rdk_fp function converts a list of RDKit molecules into a list of RDKit fingerprints.
- Parameters
mols – Used to pass the molecules that are to be converted into fingerprints.
- Returns
A list of RDKit fingerprints.
- Doc-author
Julian M. Kleber
- chembee.preparation.processing.get_mols_from_supplier(indices, supplier)[source]
The get_mols_from_supplier function takes a list of indices and a supplier object, and returns the molecules corresponding to those indices.
- Parameters
indices – Used to specify which molecules to retrieve from the supplier.
supplier – Used to specify the supplier of the molecules.
- Returns
A list of molecules.
- Doc-author
Julian M. Kleber
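The index-based selection described for get_mols_from_supplier amounts to a lookup per index. The sketch below uses a plain list as a stand-in for the supplier, on the assumption that the supplier supports integer indexing (as RDKit's SDMolSupplier does); the helper name pick_by_index is hypothetical.

```python
def pick_by_index(indices, supplier):
    """Hypothetical stand-in: return supplier[i] for each requested index, in order."""
    return [supplier[i] for i in indices]


# A list of placeholder strings stands in for a molecule supplier here.
supplier = ["mol_a", "mol_b", "mol_c", "mol_d"]
selected = pick_by_index([0, 2], supplier)
```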
- chembee.preparation.processing.load_data(file_path: str)[source]
The load_data function loads the data from a file and returns it as a Pandas DataFrame.
- Parameters
file_path:str – Used to specify the location of the file to load.
- Returns
A dataframe with the columns specified by the SDF data entries.
- Doc-author
Julian M. Kleber