Recoding Categorical to Numerical
Peter Otten
__peter__ at web.de
Sat Aug 14 04:31:34 EDT 2021
On 14/08/2021 00:33, John Griner wrote:
> Hello, and thanks in advance,
>
> I am trying to recode a categorical variable to numeric. Despite using
> lambda, if/else, dummy recoding, and so forth, from ONE column that has
> 4 values (City, Town, Suburb, Rural) I get FOUR columns:
>
> localeDummy_City localeDummy_Town localeDummy_Suburb
> localeDummy_Rural locale_recode
>
> with the corresponding numeric variable.
>
> What I want is the recode to have ONE column that has the numeric
> conversion.
>
> For instance:
>
> local_recode
> 2
> 4
> 4
> 6
> 2
> 8
> 6
> 2
> 8
> 2
> 2
> 4
> 6
> 4
> 8
>
> and so forth, where I have set City to 2, and Town to 4, etc.
>
> Again, thanks, John
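
My crystal ball says you want:

import pandas

# Each row holds a tuple of locale names
df = pandas.DataFrame(
    [
        [("City", "Suburb")],
        [("Town", "City", "Suburb")],
        [("Rural",)],
    ],
    columns=["before"],
)

# One bit per category, so every combination sums to a distinct number
flags = dict(
    City=1,
    Town=2,
    Suburb=4,
    Rural=8,
)

# set() drops repeated names so no flag is counted twice
df["after"] = df["before"].apply(
    lambda names: sum(flags[name] for name in set(names))
)
print(df)
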
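If each row instead holds a single label, as the sample output in the question (2, 4, 6, 8) suggests, a plain dictionary lookup with Series.map() may be all that is needed. A minimal sketch, assuming a column named "locale" (a made-up name) and assuming Suburb=6 and Rural=8, since the question only gives City=2 and Town=4:

```python
import pandas as pd

# Hypothetical input: one locale label per row
df = pd.DataFrame(
    {"locale": ["City", "Town", "Town", "Suburb", "City", "Rural"]}
)

# City=2 and Town=4 come from the question; Suburb=6 and Rural=8 are assumed
codes = {"City": 2, "Town": 4, "Suburb": 6, "Rural": 8}

# Look up each label in the dictionary; the result is ONE numeric column
df["locale_recode"] = df["locale"].map(codes)
print(df)
```

Labels that are missing from the dictionary come out as NaN, which makes typos in the data easy to spot.
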
If that's not it, show us your failing code, preferably as a small
self-contained script that also generates the required input data. Use
cut and paste, and include it in the message body, as attachments are
usually removed.