How to replace a cell value with each of its contour cells and yield the corresponding datasets separately in a list, the Pandas way?
marc nicole
mk1853387 at gmail.com
Sun Jan 21 07:37:43 EST 2024
Hello,
I have an initial dataframe with a random list of target cells (each cell
being identified by a pair (x, y)).
I want to yield four different dataframes, each containing the value of one
of the contour (surrounding) cells of each specified target cell.
The surrounding cells to consider for a specific target cell are (x-1, y),
(x, y-1), (x+1, y) and (x, y+1). Specifically, I randomly choose 1 to 4 of
these cells and consider them as replacements for the target cell.
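
To make the neighbourhood rule concrete, here is a minimal sketch of
choosing 1 to 4 contour cells for a single target cell. The helper name
`pick_contour_cells`, the `n_cols`/`n_rows` parameters and the bounds check
are illustrative assumptions; x is taken as the column position and y as
the row position. At an edge or corner fewer than four neighbours exist,
so the sketch draws from whatever is in bounds.

```python
import random

# Offsets of the four contour cells: (x-1, y), (x, y-1), (x+1, y), (x, y+1).
CONTOUR_OFFSETS = [(-1, 0), (0, -1), (1, 0), (0, 1)]


def pick_contour_cells(x, y, n_cols, n_rows, rng=random):
    """Return a random, non-repeating selection of in-bounds contour cells.

    Hypothetical helper: x is a column position, y is a row position.
    Returns an empty list when the cell has no in-bounds neighbour.
    """
    neighbours = [
        (x + dx, y + dy)
        for dx, dy in CONTOUR_OFFSETS
        if 0 <= x + dx < n_cols and 0 <= y + dy < n_rows
    ]
    if not neighbours:
        return []
    # Draw between 1 and all available neighbours, without repetition.
    count = rng.randint(1, len(neighbours))
    return rng.sample(neighbours, count)
```

For example, `pick_contour_cells(0, 0, 3, 3)` can only return a selection
from (1, 0) and (0, 1), because the other two neighbours fall outside a
3x3 frame.
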
I want to do that through a pandas-specific approach, without having to
define the contour cells separately and then apply the changes to the
dataframe (that is, using an all-in-one approach).
For now I have written this example, which I think is not pandas-specific:

    import itertools
    import random


    def select_target_values(dataframe, number_of_target_values):
        target_cells = []
        for _ in range(number_of_target_values):
            # Despite the names, x is a column position and y a row position
            # (see the .iat[y, x] access in create_possible_datasets).
            row_x = random.randint(0, len(dataframe.columns) - 1)
            col_y = random.randint(0, len(dataframe) - 1)
            target_cells.append((row_x, col_y))
        return target_cells


    def select_contours(target_cells):
        contour_coordinates = [(0, 1), (1, 0), (0, -1), (-1, 0)]
        contour_cells = []
        for target_cell in target_cells:
            # random contour count for each cell; the first
            # contour_cells_count offsets are taken in fixed order,
            # and coordinates are not checked against the frame bounds
            contour_cells_count = random.randint(1, 4)
            try:
                contour_cells.append(
                    [tuple(map(lambda i, j: i + j,
                               (target_cell[0], target_cell[1]),
                               contour_coordinates[iteration_]))
                     for iteration_ in range(contour_cells_count)])
            except IndexError:
                continue
        return contour_cells


    def apply_contours(target_cells, contour_cells):
        target_cells_with_contour = []
        # create one single list of cells
        for idx, target_cell in enumerate(target_cells):
            target_cell_with_contour = [target_cell]
            target_cell_with_contour.extend(contour_cells[idx])
            target_cells_with_contour.append(target_cell_with_contour)
        return target_cells_with_contour


    def create_possible_datasets(dataframe, target_cells_with_contour):
        all_datasets_final = []
        dataframe_original = dataframe.copy()
        # check for nans
        # (utils_tuple_list_not_contain_nan is a helper not shown here)
        list_tuples_idx_cells_all_datasets = list(filter(
            lambda x: utils_tuple_list_not_contain_nan(x),
            [list(tuples) for tuples in list(itertools.product(
                *target_cells_with_contour))]))
        target_original_cells_coordinates = list(map(
            lambda x: x[0],
            [target_and_contour_cell for target_and_contour_cell in
             target_cells_with_contour]))
        for dataset_index_values in list_tuples_idx_cells_all_datasets:
            all_datasets = []
            for idx_cell in range(len(dataset_index_values)):
                dataframe_cpy = dataframe.copy()
                dataframe_cpy.iat[
                    target_original_cells_coordinates[idx_cell][1],
                    target_original_cells_coordinates[idx_cell][0]
                ] = dataframe_original.iloc[
                    dataset_index_values[idx_cell][1],
                    dataset_index_values[idx_cell][0]
                ]
                all_datasets.append(dataframe_cpy)
            all_datasets_final.append(all_datasets)
        return all_datasets_final

If you have a better Pandas approach (unifying all these methods into one
that makes use of dataframe methods only), please let me know.
Thanks!
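
As a possible starting point, here is a minimal sketch of the
"four dataframes" part using only `DataFrame.shift` and `DataFrame.mask`.
It is an illustration under assumptions, not a drop-in replacement: the
function name `contour_datasets` and the direction labels are hypothetical,
x is treated as the column position and y as the row position, and the
random choice of 1 to 4 neighbours is left out. A neighbour that falls
outside the frame yields NaN.

```python
import pandas as pd

# (dx, dy) offsets of the contour cells; x is the column position,
# y is the row position. The direction labels are illustrative only.
CONTOUR_OFFSETS = {
    "left": (-1, 0),
    "up": (0, -1),
    "right": (1, 0),
    "down": (0, 1),
}


def contour_datasets(df, target_cells):
    """Return a dict with one DataFrame per contour direction.

    In each result, every target cell (x, y) holds the value of its
    neighbour in that direction; all other cells are unchanged.
    """
    # Boolean mask marking the target cells.
    mask = pd.DataFrame(False, index=df.index, columns=df.columns)
    for x, y in target_cells:
        mask.iat[y, x] = True

    results = {}
    for name, (dx, dy) in CONTOUR_OFFSETS.items():
        # After shifting by (-dy, -dx), position (y, x) holds the value
        # that was originally at (y + dy, x + dx).
        neighbour_values = df.shift(periods=-dy, axis=0).shift(periods=-dx, axis=1)
        # Replace only the masked (target) cells with the neighbour values.
        results[name] = df.mask(mask, neighbour_values)
    return results
```

Each call shifts the whole frame once per direction, so the cost does not
depend on the number of target cells, and the original dataframe is left
unmodified.
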