How to create a binary tree hierarchy given a list of elements as its leaves
marc nicole
mk1853387 at gmail.com
Sun Jan 28 13:16:10 EST 2024
So I am trying to build a binary tree hierarchy given numerical elements
serving for its leaves (last level of the tree to build). From the leaves I
want to randomly create a name for the higher level of the hierarchy and
assign it to the children elements. For example: if the elements inputted
are `0,1,2,3` then I would like to create firstly 4 elements (say by random
giving them a label composed of a letter and a number) then for the second
level (iteration) I assign each of 0,1 to a random name label (e.g. `b1`)
and `2,3` to another label (`b2`) then for the third level I assign a
parent label to each of `b1` and `b2` as `c1`.
An illustation of the example is the following tree:
[image: tree_exp.PNG]
For this I use numpy's `array_split()` to get the chunks of arrays based on
the iteration needs.
for example to get the first iteration arrays I use `np.array_split(input,
(input.size // k))` where `k` is an even number. In order to assign a
parent node to the children the array range should enclose the children's.
For example to assign the parent node with label `a1` to children `b1` and
`b2` with range respectively [0,1] and [2,3], the parent should have the
range [0,3].
All is fine until a certain iteration (k=4) returns parent with range [0,8]
which is overlapping to children ranges and therefore cannot be their
parent.
My question is how to evenly partition such arrays in a binary way and
create such binary tree so that to obtain for k=4 the first range to be
[0,7] instead of [0,8]?
My code is the following:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import string
import random
import numpy as np
def generate_numbers_list_until_number(stop_number):
if str(stop_number).isnumeric():
return np.arange(stop_number)
else:
raise TypeError('Input should be a number!')
def generate_node_label():
return random.choice(string.ascii_lowercase) \
+ str(random.randint(0, 10))
def main():
data = generate_numbers_list_until_number(100)
k = 1
hierarchies = []
cells_arrays = np.array_split(data, data.size // k)
print cells_arrays
used_node_hierarchy_name = []
node_hierarchy_name = [generate_node_label() for _ in range(0,
len(cells_arrays))]
used_node_hierarchy_name.extend(node_hierarchy_name)
while len(node_hierarchy_name) > 1:
k = k * 2
# bug here in the following line
cells_arrays = list(map(lambda x: [x[0], x[-1]],
np.array_split(data, data.size // k)))
print cells_arrays
node_hierarchy_name = []
# node hierarchy names should not be redundant in another level
for _ in range(0, len(cells_arrays)):
node_name = generate_node_label()
while node_name in used_node_hierarchy_name:
node_name = generate_node_label()
node_hierarchy_name.append(node_name)
used_node_hierarchy_name.extend(node_hierarchy_name)
print used_node_hierarchy_name
hierarchies.append(list(zip(node_hierarchy_name, cells_arrays)))
More information about the Python-list
mailing list