Sparse feature vector generation
Hi numpy users,

Is there a convenient way in numpy to go from "string" features like:

"uc_berkeley", "google", 1
"stanford", "intel", 1
. . .
"uiuc", "texas_instruments", 0

to a numpy matrix like:

"uc_berkeley", "stanford", ..., "uiuc", "google", "intel", "texas_instruments", "bool"
1 0 ... 0 1 0 0 1
0 1 ... 0 0 1 0 1
:
0 0 ... 1 0 0 1 0

I really appreciate you taking the time to help!

Thanks!
--Dhruv
I would just use a lookup dict that maps each name to its column index:

    import numpy

    names = ["uc_berkeley", "stanford", "uiuc", "google", "intel", "texas_instruments", "bool"]
    # name -> column index (the keys must be the names, not the indices)
    lookup = dict(zip(names, range(len(names))))

Now, given you have n entries:

    S = numpy.zeros((n, len(names)), dtype=numpy.int32)
    for k in ["uc_berkeley", "google", "bool"]:
        S[0, lookup[k]] += 1
    for k in ["stanford", "intel", "bool"]:
        S[1, lookup[k]] += 1
    ...

and so forth, so lookup[k] returns the column index to use.

Hope this helps. I am not aware of an automatic way to do this, but I may be wrong.

cheers,
Samuel

On 04.01.2012, at 07:25, Dhruvkaran Mehta wrote:
> Hi numpy users,
>
> Is there a convenient way in numpy to go from "string" features like:
>
> "uc_berkeley", "google", 1
> "stanford", "intel", 1
> . . .
> "uiuc", "texas_instruments", 0
>
> to a numpy matrix like:
>
> "uc_berkeley", "stanford", ..., "uiuc", "google", "intel", "texas_instruments", "bool"
> 1 0 ... 0 1 0 0 1
> 0 1 ... 0 0 1 0 1
> :
> 0 0 ... 1 0 0 1 0
>
> I really appreciate you taking the time to help!
>
> Thanks!
> --Dhruv
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
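
For reference, the lookup-dict idea from the reply can be sketched end to end roughly as follows. This is only an illustration, not a built-in numpy feature: the sample rows, the function name one_hot_matrix, and the choice to sort the column names are assumptions made for the example, and it assumes no string appears both as a school and as a company.

```python
import numpy as np

# Hypothetical sample rows in the (school, company, flag) shape from the question.
rows = [
    ("uc_berkeley", "google", 1),
    ("stanford", "intel", 1),
    ("uiuc", "texas_instruments", 0),
]


def one_hot_matrix(rows):
    """Return (column_names, matrix) with one indicator column per distinct string.

    Assumes school and company names never collide; the column order
    (sorted schools, then sorted companies, then "bool") is an arbitrary choice.
    """
    schools = sorted({r[0] for r in rows})
    companies = sorted({r[1] for r in rows})
    names = schools + companies + ["bool"]

    # name -> column index, as in the reply above
    lookup = {name: i for i, name in enumerate(names)}

    S = np.zeros((len(rows), len(names)), dtype=np.int32)
    for i, (school, company, flag) in enumerate(rows):
        S[i, lookup[school]] = 1
        S[i, lookup[company]] = 1
        S[i, lookup["bool"]] = flag  # copy the 0/1 label into the last column
    return names, S


names, S = one_hot_matrix(rows)
```

Building the names from the data avoids typing the column list by hand. If the vocabulary is large, the same (row, column) index pairs could probably be handed to a sparse matrix type instead of a dense array.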
participants (2)
- Dhruvkaran Mehta
- Samuel John