Re: [Pandas-dev] FW: [EXTERNAL] Re: pandas or new project
Have you guys seen these?
https://www.python.org/dev/peps/pep-0589/
https://github.com/typelevel/frameless
I would love to have TypedDataFrame implemented in pandas and then have my IDE introspect type errors. I have a large set of business processes like this:
def gen_y(df: pd.DataFrame[x1, x2, x3, x4]) -> pd.DataFrame[x1, x2, x3, x4, y]:
    ...
    return df
Food for thought...
On Wed, Jan 9, 2019 at 9:31 PM David M Rashty <David.Rashty@flagstar.com> wrote:
From: Wes McKinney [mailto:wesmckinn@gmail.com]
Sent: Thursday, September 13, 2018 9:56 PM
To: Tom Augspurger <tom.augspurger88@gmail.com>
Cc: David M Rashty <David.Rashty@flagstar.com>; pandas-dev@python.org
Subject: [EXTERNAL] Re: [Pandas-dev] pandas or new project
hi David,
There's nothing really wrong with injecting a bunch of custom methods into the DataFrame.* namespace. If you wanted, you could release your package so that users just run
import pandas_stata
and then the new methods would be available. This is pretty common in large corporate environments that use pandas AFAICT. You can also propose your changes in pull requests to pandas.
- Wes
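As a rough illustration of the approach described above, a package could attach its methods to DataFrame at import time. This is only a sketch: the module layout, the helper `_matching`, and the wildcard semantics (shell-style matching via `fnmatch`) are assumptions for illustration, not the code discussed in this thread.

```python
# Hypothetical contents of a module such as pandas_stata.py (assumed layout).
import fnmatch

import pandas as pd


def _matching(df, pattern):
    """Return the column labels that match a shell-style wildcard pattern."""
    return [c for c in df.columns if fnmatch.fnmatchcase(str(c), pattern)]


def sdrop(self, pattern):
    """Drop every column whose name matches the pattern."""
    return self.drop(columns=_matching(self, pattern))


def skeep(self, pattern):
    """Keep only the columns whose names match the pattern."""
    return self[_matching(self, pattern)]


# Importing the module injects the methods into the DataFrame namespace.
pd.DataFrame.sdrop = sdrop
pd.DataFrame.skeep = skeep

df = pd.DataFrame({"myvar1": [1], "myvar2": [2], "other": [3]})
dropped = df.sdrop("myvar*")  # only "other" remains
kept = df.skeep("myvar*")     # "myvar1" and "myvar2" remain
```

pandas also documents `pd.api.extensions.register_dataframe_accessor`, which groups custom methods under a namespace (for example `df.stata.drop(...)`) instead of adding them directly to DataFrame; that may be worth considering if name clashes are a concern.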
On Thu, Sep 13, 2018 at 9:41 PM Tom Augspurger <tom.augspurger88@gmail.com> wrote:
With respect to your `sdrop` and `skeep`, that's the goal of DataFrame.filter, though the name isn't the best, so it
may be deprecated in favor of something better.
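For reference, a minimal sketch of how DataFrame.filter can approximate the two Stata commands. The column names are made up for illustration, and the drop case combines filter with DataFrame.drop, which is one possible way to express it rather than a recommended idiom.

```python
import pandas as pd

df = pd.DataFrame({"myvar1": [1], "myvar2": [2], "other": [3]})

# Rough analogue of Stata's `keep myvar*`: keep columns whose name starts with "myvar".
kept = df.filter(regex=r"^myvar")

# Rough analogue of `drop myvar*`: drop the columns that the same filter selects.
dropped = df.drop(columns=df.filter(regex=r"^myvar").columns)
```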
The rest sound interesting, but likely out of scope for pandas. If you build an open source library then we'd be
happy to include it on pandas' ecosystem page: http://pandas.pydata.org/pandas-docs/stable/ecosystem.html
Tom
On Thu, Sep 13, 2018 at 7:58 PM David M Rashty <David.Rashty@flagstar.com> wrote:
Dear pandas team,
I am a long-time Stata user and I started using pandas about a year ago in order to build web applications using an in-memory dataframe structure. As a business user, I’ve found Stata to have a key advantage over pandas that many others have also noted: much faster development time. Examples in Stata:
drop myvar* // drops all columns starting with myvar
keep myvar* // drops all columns except those starting with myvar
reg z y x // runs the regression z = a+bx+cy + error
In order to use pandas in a Stata-like fashion, I’ve had to monkey patch large parts of the library e.g.,
df = df.sdrop('myvar*') # same as above
df = df.skeep('myvar*') # same as above
df = df.sreg('z y x') # same as above
df = df.squery('a>80 & b.str.contains("hello") & c.isin([1,2,3])') # df.query doesn’t support str.contains and isin to my knowledge
I put an “s” in front of my methods to mean either “stata” or “sugar”.
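For comparison, the filter in the `squery` example can be written in stock pandas with a boolean mask. This is a sketch on made-up data; the column names a, b, and c come from the example above, the values are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [90, 85, 70, 95],
    "b": ["hello world", "goodbye", "hello again", "say hello"],
    "c": [1, 2, 3, 9],
})

# Each condition yields a boolean Series; & combines them element-wise.
mask = (df["a"] > 80) & df["b"].str.contains("hello") & df["c"].isin([1, 2, 3])
result = df[mask]  # only the first row satisfies all three conditions
```

The mask form is more verbose than a query string, which is the development-time cost the message describes. For the membership test specifically, the pandas documentation describes an `in` operator inside `DataFrame.query` expressions.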
Additionally, I’ve built:
a) A system to automatically load new DataFrame methods into memory (no additional imports required)
b) A caching system to make loading data blazing fast, along with a much tighter syntax, e.g., pd.read_stata('mydata.dta') (6 secs load time) vs. use.mydata (0.001 secs load time after the first read from file)
c) A system of column “labels” and formats to prettify various reports, e.g., df.sscatter('rate score') produces a scatter plot with the labels “Interest Rate, %” and “Credit Score”, respectively
d) A reactive web app (using Flask/Redis) to quickly view the full DataFrame content in a browser
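The caching idea in item (b) could be sketched roughly as follows. This is not the system described above, whose design is not shown in this thread; the function name `cached_read`, the module-level dictionary, and the modification-time check are all assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

_CACHE = {}  # hypothetical in-memory cache: path -> (mtime, DataFrame)


def cached_read(path, loader=pd.read_stata):
    """Load `path` with `loader`, reusing the cached frame if the file is unchanged."""
    mtime = os.path.getmtime(path)
    hit = _CACHE.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # cache hit: skip the slow read
    df = loader(path)
    _CACHE[path] = (mtime, df)
    return df


# Demo with a small CSV file and a loader that records how often it is called.
calls = []


def counting_loader(p):
    calls.append(p)
    return pd.read_csv(p)


with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("x,y\n1,2\n")
    demo_path = f.name

first = cached_read(demo_path, loader=counting_loader)   # reads from disk
second = cached_read(demo_path, loader=counting_loader)  # served from memory
os.remove(demo_path)
```

Note that this sketch hands back the same DataFrame object on every hit, so a caller that mutates it in place also changes the cached copy; returning `df.copy()` would avoid that at the cost of some speed.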
Basically, I’ve tried to eliminate any obvious advantages Stata has over pandas.
I’m potentially interested in developing this project into something bigger. Would you like me to share my work in the context of pandas or should it be a completely separate project with a different scope?
Thanks,
David Rashty | Flagstar Bank | Whole Loan Trading | 248-312-6692 | david.rashty@flagstar.com
_______________________________________________
Pandas-dev mailing list
Pandas-dev@python.org
https://mail.python.org/mailman/listinfo/pandas-dev