[Tutor] Is there a better way to get a current mid-rate Yen quote with Python?
rdm at rcblue.com
Sat Jul 26 02:53:38 CEST 2008
At 04:11 PM 7/25/2008, Alan Gauld wrote:
>"Dick Moores" <rdm at rcblue.com> wrote
>>>Certainly Beautiful Soup will not be much longer and a lot more
>>>elegant and probably more resilient.
>>Alan, expand a bit, please. Longer? Resilient?
>Longer as in lines of code. BS is good for extracting several
>different parts from the soup, but just to pull out one very
>specific item the setup and so on may mean that the framework
>actually works out the same or longer than your code.
>Resilient as in able to handle unexpected changes in the HTML used
>by the site or slight changes in formatting of the results etc.
>>>But to extract a single piece of text in a well defined location
>>>then your approach although somewhat crude will work just fine.
>>Crude? What's crude about my code?
>It's just a little bit too hard-coded for my liking; all that
>splitting and searching means there's a lot of work going on
>to extract a small piece of data. You iterate over the whole page to
>do the first split,
Ah. I had the idea that it would be better to split it into its
natural lines, by line 7. But I could have just started with the
first relevant split.
> then you iterate over the whole thing again to find the line you
> want. Consider this line:
>if 'JPY' in x and '>USDJPY=X<' in x:
>Since 'JPY' is in both strings, the first check is effectively
>redundant but still requires a string search over the line.
>A well crafted regex would probably be faster than the double in
>test and provide better checking by including
>allowances for extra spaces or case changes etc.
I'll try to do that.
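For instance (sketching with an invented HTML fragment, since I don't
have the real page markup in front of me), a single regex could replace
both 'in' tests and the later splits and slices:

```python
import re

# Hypothetical fragment of the quote page; the real Yahoo markup may differ.
html = '<td><a href="/q?s=USDJPY=X">USDJPY=X</a></td><td><b>107.84</b></td>'

# Find the USDJPY=X marker, then capture the first number inside the
# following <b> element. \s* tolerates extra whitespace and re.IGNORECASE
# tolerates case changes, as Alan suggests.
pattern = re.compile(r'USDJPY=X\s*<.*?<b>\s*([\d.]+)\s*</b>', re.IGNORECASE)
m = pattern.search(html)
if m:
    rate = float(m.group(1))
```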
>Then having found the string once with 'in' you then have to find it
>again with split().
Do you mean string x? After c = x (line 10), the next split is on c.
I'm not finding it again, am I?
> You could just have done a find the first time and stored the
> index as a basis for the slicing later.
>You also use lots of very specific slicing values to extract the
>data - that's where you lose resilience compared to a parser approach like BS.
Since I posted, I found that I could add some resilience by extending
the end of the slice and then cutting it back with rstrip().
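Roughly like this (the string is invented, and the slice numbers are
only illustrative):

```python
raw = '<b>107.84</b>'

# Deliberately overshoot the end of the slice, then strip the trailing
# markup characters back off. This survives a rate that is a digit or
# two longer or shorter than expected.
value = raw[3:12].rstrip('</b')
rate = float(value)
```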
I haven't the slightest idea what a parser is. But I'll find out
while learning BS.
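In the meantime, here's my rough understanding of the idea, using the
standard library's html.parser (in today's Python 3 spelling) rather
than BS, with made-up HTML: a parser understands the tag structure, so
the code asks for "the text inside <b>" instead of counting characters.

```python
from html.parser import HTMLParser

class QuoteParser(HTMLParser):
    """Collect the text content of every <b> element."""
    def __init__(self):
        super().__init__()
        self.in_b = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == 'b':
            self.in_b = True

    def handle_endtag(self, tag):
        if tag == 'b':
            self.in_b = False

    def handle_data(self, data):
        if self.in_b:
            self.values.append(data)

p = QuoteParser()
p.feed('<td>USDJPY=X</td><td><b>107.84</b></td>')  # invented fragment
rate = float(p.values[0])
```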
> Again I suspect a regex might work better in extracting the value.
> And hard coding the url in the function also adds to its fragility.
I can't imagine what else would be possible.
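Or is the idea just to pass the URL in as a parameter with a default,
something like this? (The address below is only a placeholder, not the
one my script actually uses.)

```python
def quote_url(symbol='USDJPY=X', base='http://finance.example.com/q'):
    # Parameterizing the symbol and the base URL removes the fragility
    # of a URL hard-coded inside the function body: callers can point
    # the function at a different symbol or mirror without editing it.
    return '%s?s=%s' % (base, symbol)
```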
>Stylistically all those single character variable names hurts
>readability and maintainability too.
I was at a loss as to what variable names to use, so I figured I'd
use a, b, c, .. in order, because I thought it was obvious that I was
narrowing the search for the yen rate. Could you give me an idea of
what names I could use?
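Is something along these lines what you mean? (The splitting mirrors
the kind of code we're discussing; the sample HTML is invented.)

```python
page_html = 'header\nrow <b>USDJPY=X</b> <b>107.84</b>\nfooter'

# Name each value for what it is, rather than a, b, c, ...
for line in page_html.splitlines():          # was: for x in ...
    if '>USDJPY=X<' in line:
        quote_line = line                    # was: c = x
        fields = quote_line.split('<b>')     # was: d = c.split(...)
        rate_text = fields[-1].split('</b>')[0]
        rate = float(rate_text)
```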
>>I want to improve, so please tell
>It will work as is, but it could be tidied up a bit is all.
Thanks Alan, for your tough look at the code. I appreciate it.
Have you seen Kelie Feng's video introducing the terrific and free
IDE, Ulipad? <http://www.rcblue.com/u3/>
Get Ulipad 3.9 from <http://code.google.com/p/ulipad/downloads/list>
svn for the latest revision <http://ulipad.googlecode.com/svn/trunk/>
Mailing list for Ulipad: <http://groups-beta.google.com/group/ulipad>