[Tutor] weird lambda expression -- can someone help me understand how this works

Sat Dec 14 03:29:54 CET 2013

On Sat, Dec 14, 2013 at 12:14 PM, Michael Crawford <dalupus at gmail.com> wrote:
> I found this piece of code on github
>
> https://gist.github.com/kljensen/5452382
>
> def one_hot_dataframe(data, cols, replace=False):
>     """ Takes a dataframe and a list of columns that need to be encoded.
>         Returns a 3-tuple comprising the data, the vectorized data,
>         and the fitted vectorizer.
>     """
>     vec = DictVectorizer()
>     mkdict = lambda row: dict((col, row[col]) for col in cols)  #<<<<<<<<<<<<<<<<<<
>     vecData = pandas.DataFrame(vec.fit_transform(
>         data[cols].apply(mkdict, axis=1)).toarray())
>     vecData.columns = vec.get_feature_names()
>     vecData.index = data.index
>     if replace is True:
>         data = data.drop(cols, axis=1)
>         data = data.join(vecData)
>     return (data, vecData, vec)
>
> I don't understand how that lambda expression works.
> For starters where did row come from?
> How did it know it was working on data?

Consider this simple example:

>>> l = lambda x: x**2
>>> apply(l, (3,))
9

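A lambda is an anonymous function. Here apply() (a Python 2 built-in;
in Python 3 you would simply write l(3)) calls l with the argument 3,
so x is 3 and the lambda returns x**2, which is 9.

The same thing happens in your code. There, apply is the pandas
DataFrame.apply method: data[cols].apply(mkdict, axis=1) calls mkdict
once for each row of data[cols] and passes that row as the argument.
That is where row comes from, and that is how the lambda ends up
working on data.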
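
The way apply() hands each row to the lambda can be sketched in plain
Python. This is only an illustration, not how pandas is implemented:
apply_rows is a hypothetical helper standing in for
DataFrame.apply(func, axis=1), each row is modelled as a plain dict, and
the column names and values are made up.

```python
# Hypothetical stand-in for DataFrame.apply(func, axis=1):
# call func once per row and collect the results.
def apply_rows(rows, func):
    return [func(row) for row in rows]

cols = ["color", "size"]  # made-up column names

# Same lambda as in the quoted code; `row` is just its parameter name.
mkdict = lambda row: dict((col, row[col]) for col in cols)

# The equivalent named function, written with def.
def mkdict_def(row):
    return dict((col, row[col]) for col in cols)

# Made-up data: each dict plays the role of one DataFrame row.
rows = [
    {"color": "red", "size": "S", "price": 3},
    {"color": "blue", "size": "M", "price": 5},
]

# apply_rows supplies the argument, so `row` is bound to each dict in turn.
result = apply_rows(rows, mkdict)
print(result)
```

The lambda never looks up data by itself. Whoever calls it decides what
row is, exactly as apply() decided that x was 3 in the example above.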

Hope this helps you.

Best,
Amit.