On Fri, Jul 31, 2020 at 10:59 PM Steven D'Aprano <steve@pearwood.info> wrote:
The first request in this thread was from Hans:


He is using a dict to hold an array of columns indexed by name
`{column_name: column}` and wanted to re-order and insert columns at
arbitrary positions.

Pandas is designed for columnar data.
SQL is one source and destination for columnar data that Pandas supports.
Pandas handles NULL/None/NaN more consistently than dict.get("key", None).
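
As a rough illustration of the two points above, here is a minimal sketch that round-trips a small frame through an in-memory SQLite database and checks how missing values are reported. The table name, column names, and data are made up for the example; only `to_sql`, `read_sql`, and `isna` are real pandas calls.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data: None and NaN both mark a missing value.
df_in = pd.DataFrame({"name": ["x", "y", "z"], "value": [1.0, None, float("nan")]})

conn = sqlite3.connect(":memory:")
df_in.to_sql("measurements", conn, index=False)  # missing values are stored as NULL
df_out = pd.read_sql("SELECT name, value FROM measurements", conn)
conn.close()

# isna() treats None and NaN alike, before and after the round trip.
missing_before = df_in["value"].isna().tolist()
missing_after = df_out["value"].isna().tolist()

# A plain dict cannot tell "key absent" from "value is None" through .get().
row = {"value": None}
ambiguous = row.get("value", None) is row.get("other", None)
```

With a plain dict you would have to fall back on `"value" in row` to distinguish the two cases, whereas the frame reports missing values the same way regardless of where they came from.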

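import pandas as pd

# example data (illustrative)
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
assert df.columns.tolist() == ['a', 'b', 'c']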

# this creates a copy
df2 = df[['b', 'c', 'a']]

# this also returns a new DataFrame; reindex does not modify df in place
df3 = df.reindex(columns=['b', 'c', 'a'])

# this inserts a column after column b
# (insert() places the new column at the given position, hence the + 1)
df.insert(df.columns.get_loc('b') + 1, 'newcolumn', df['c'])
df.insert(df.columns.tolist().index('b') + 1, 'newcolumn2', df['c'])


If you need to reorder rows (or columns, by going through the transpose df.T), you could select with .loc[[list, of, indexvalues]] or .iloc[[list, of, ints]].
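
A small sketch of that selection, assuming a frame with string index labels. The labels, column names, and values are invented for the example.

```python
import pandas as pd

# Hypothetical frame; the index labels and column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

# Reorder rows by index label.
by_label = df.loc[["z", "x", "y"]]

# Reorder rows by integer position.
by_position = df.iloc[[2, 0, 1]]

# Reorder columns the same way by going through the transpose and back.
cols_reordered = df.T.loc[["b", "a"]].T
```

All three selections return new objects and leave `df` itself unchanged.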

To accomplish the same with the Python standard library, you'd need to create a data structure that is an amalgamation of list and dict: a MutableMapping with a Sequence interface for at least .keys() (or .keyseq()).
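
For what it's worth, here is one way such an amalgamation might look. This is only a sketch, not a proposal for the stdlib: the class name `OrderedColumns` and the methods `insert()` and `reorder()` are hypothetical, and `keyseq()` just borrows the name floated above.

```python
from collections.abc import MutableMapping


class OrderedColumns(MutableMapping):
    """Hypothetical dict-like container whose keys can be positioned like a list."""

    def __init__(self, items=()):
        self._keys = []   # key order (the "list" half)
        self._data = {}   # key -> value (the "dict" half)
        for key, value in dict(items).items():
            self[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if key not in self._data:
            self._keys.append(key)  # new keys go last, as in a dict
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]
        self._keys.remove(key)

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)

    def keyseq(self):
        """Return the keys as a tuple, which supports indexing and .index()."""
        return tuple(self._keys)

    def insert(self, position, key, value):
        """Insert a new key at an integer position, like list.insert()."""
        if key in self._data:
            raise KeyError(f"{key!r} already present")
        self._keys.insert(position, key)
        self._data[key] = value

    def reorder(self, keys):
        """Replace the key order; keys must be a permutation of the current keys."""
        keys = list(keys)
        if len(keys) != len(self._keys) or set(keys) != set(self._keys):
            raise ValueError("keys must be a permutation of the existing keys")
        self._keys = keys


columns = OrderedColumns({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
columns.insert(columns.keyseq().index("b") + 1, "newcolumn", [7, 8])
columns.reorder(["b", "c", "newcolumn", "a"])
```

Keeping the order in a separate list makes positional insert and reorder simple, at the cost of O(n) deletes and key lookups by position, which is probably acceptable for a modest number of columns.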

Is there already a "pure python" DataFrame?

Do whatever you want,
but much of the Pandas DataFrame API is also available in Dask, Modin, and cuDF for distributed, larger-than-RAM, and GPU use.