[Tutor] Is there a better way to get a current mid-rate Yen quote with Python?

Dick Moores rdm at rcblue.com
Sat Jul 26 02:53:38 CEST 2008


At 04:11 PM 7/25/2008, Alan Gauld wrote:
>"Dick Moores" <rdm at rcblue.com> wrote
>>>Certainly Beautiful Soup will not be much longer and a lot more 
>>>elegant and probably more resilient.
>>Alan, expand a bit, please. Longer? Resilient?
>
>Longer as in lines of code. BS is good for extracting several 
>different parts from the soup, but just to pull out one very 
>specific item the setup and so on may mean that the framework 
>actually works out the same or longer than your code.
>
>Resilient as in able to handle unexpected changes in the HTML used 
>by the site or slight changes in formatting of the results etc.
>
>>>But to extract a single piece of text in a well defined location 
>>>then your approach although somewhat crude will work just fine.
>>Crude? What's crude about my code?
>
>It's just a little bit too hard coded for my liking, all that 
>splitting and searching means there's a lot of work going on
>to extract a small piece of data. You iterate over the whole page to 
>do the first split,

Ah. I had the idea that it would be better to split it into its 
natural lines, by line 7.  But I could have just started with the 
first relevant split.

>  then you iterate over the whole thing again to find the line you 
> want. Consider this line:
>
>if 'JPY' in x and '>USDJPY=X<' in x:
>
>Since JPY is in both strings the first check is effectively 
>redundant but still requires a string search over the line.
>A well crafted regex would probably be faster than the double 'in' 
>test and provide better checking by including
>allowances for extra spaces or case changes etc.

I'll try to do that.
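
For illustration, a regex along the lines Alan describes might look like the sketch below. The markup around the rate (an anchor followed by a <b> element) is an assumption, since the real page is not shown in this thread, and the names RATE_RE and find_rate are made up for the example. It is written for Python 3.

```python
import re

# Hypothetical pattern: the markup around the rate is an assumption.
# It tolerates whitespace around the symbol and ignores case.
RATE_RE = re.compile(
    r">\s*USDJPY=X\s*<.*?(\d+\.\d+)",
    re.IGNORECASE | re.DOTALL,
)


def find_rate(html):
    """Return the first decimal number after the USDJPY=X symbol, or None."""
    match = RATE_RE.search(html)
    return float(match.group(1)) if match else None


# Made-up sample line, not copied from the real site.
sample = '<td><a href="/q?s=USDJPY=X"> usdjpy=x </a></td><td><b>107.825</b></td>'
print(find_rate(sample))
```

One search replaces both 'in' tests and the later slicing, and a missing match yields None rather than an IndexError.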

>Then having found the string once with 'in' you then have to find it 
>again with split().

Do you mean string x? After c = x (line 10), the next split is on c. 
I'm not finding it again, am I?

>  You could just have done a find the first time and stored the 
> index as a basis for the slicing later.
>
>You also use lots of very specific slicing values to extract the 
>data - that's where you lose resilience compared to a parser approach like BS.
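
A rough sketch of the "find once and keep the index" idea follows. The function name rate_after_marker and the <b>...</b> tags around the value are assumptions made for the example. The point is only that str.find() returns a position that later searches can start from, so nothing is scanned twice and no fixed offsets are needed.

```python
def rate_after_marker(page, marker=">USDJPY=X<"):
    """Return the text of the first <b>...</b> element after marker, or None."""
    pos = page.find(marker)        # one search; -1 means "not found"
    if pos == -1:
        return None
    start = page.find("<b>", pos)  # search forward from the stored index
    if start == -1:
        return None
    start += len("<b>")
    end = page.find("</b>", start)
    if end == -1:
        return None
    # Slice between the two tags instead of using hard-coded offsets.
    return page[start:end].strip()


# Made-up sample text, not copied from the real site.
sample_page = "<a>USDJPY=X</a> <td><b> 107.825 </b></td>"
print(rate_after_marker(sample_page))
```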

Since I posted I found that I could add some resilience by extending 
the end of the slice and then cutting back with an rstrip().
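
The rstrip() trick might look something like this sketch. The start offset, the width of 8 and the closing </b> tag are all assumptions, as the original code is not reproduced in this message.

```python
def slice_rate(line, start, width=8):
    """Take a generous slice, then trim any tag characters left at the end."""
    raw = line[start:start + width]
    # rstrip("</b>") removes any trailing run of the characters
    # '<', '/', 'b' and '>'; it does not look for the literal suffix.
    # That is safe here because a rate holds only digits and a dot.
    return raw.rstrip("</b>")


# Two made-up lines whose values have different lengths.
print(slice_rate("<b>107.825</b>", 3))
print(slice_rate("<b>107.8</b>", 3))
```

The same slice bounds cope with both values, which is the extra resilience. It still depends on the fixed start offset, so it only goes part of the way.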

I haven't the slightest idea what a parser is. But I'll find out 
while learning BS.
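
As a rough idea of what a parser does: instead of treating the page as one long string, it reads the HTML tag by tag and reports each start tag, end tag and piece of text as it meets them. The sketch below uses html.parser from the Python 3 standard library rather than Beautiful Soup, and it assumes the value of interest sits inside a <b> element. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class BoldTextCollector(HTMLParser):
    """Collect the text found inside every <b>...</b> element."""

    def __init__(self):
        super().__init__()
        self.in_bold = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased, so <B> and <b> look the same.
        if tag == "b":
            self.in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self.in_bold = False

    def handle_data(self, data):
        if self.in_bold and data.strip():
            self.values.append(data.strip())


def bold_values(html):
    """Return a list of the text inside each <b> element of html."""
    collector = BoldTextCollector()
    collector.feed(html)
    collector.close()
    return collector.values


print(bold_values("<TD><B> 107.825 </B></TD><td>other</td>"))
```

Changes in case, spacing or attribute order do not affect the result, which is the kind of resilience Alan means.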

>  Again I suspect a regex might work better in extracting the value. 
> And hard coding the url in the function also adds to its fragility.

I can't imagine what else is possible?
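
One possibility is to pass the URL in as an argument with a default value, so the address lives in one obvious place and can be changed without editing the function body. In the sketch below DEFAULT_URL is a placeholder rather than the real quote page, and the opener parameter is an extra added here so the function can be tried without a network connection. It uses urllib.request from Python 3.

```python
from urllib.request import urlopen

# Placeholder address for illustration only, not the real quote page.
DEFAULT_URL = "http://example.com/quotes?s=USDJPY=X"


def fetch_page(url=DEFAULT_URL, opener=urlopen):
    """Download url and return its contents as text.

    opener is any callable that takes a URL and returns a file-like
    object usable in a with statement; it defaults to urlopen.
    """
    with opener(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Keeping the download apart from the text extraction also means the extraction code can be tested against a saved copy of the page.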

>Stylistically all those single character variable names hurt 
>readability and maintainability too.

I was at a loss as to what variable names to use, so I figured I'd 
use a, b, c, .. in order, because I thought it was obvious that I was 
narrowing the search for the yen rate. Could you give me an idea of 
what names I could use?
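
One way to name the steps is to say what each intermediate value holds. The sketch below is not the original code, which is not shown in this message. It is a made-up narrowing sequence, with an assumed <b>...</b> wrapper around the rate, written only to show names that could replace a, b, c.

```python
def extract_rate(page_text, symbol=">USDJPY=X<"):
    """Narrow page_text down to the rate string, or return None."""
    page_lines = page_text.splitlines()
    quote_lines = [line for line in page_lines if symbol in line]
    if not quote_lines:
        return None
    quote_line = quote_lines[0]

    # Each name says what is left after the previous narrowing step.
    _, _, after_symbol = quote_line.partition(symbol)
    _, bold_tag, after_bold_tag = after_symbol.partition("<b>")
    if not bold_tag:
        return None
    rate_text, _, _ = after_bold_tag.partition("</b>")
    return rate_text.strip()


# Made-up sample text, not copied from the real site.
sample_text = "header\n<a>USDJPY=X</a><td><b>107.825</b></td>\nfooter"
print(extract_rate(sample_text))
```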

>>I want to improve, so please tell
>
>It will work as is, but it could be tidied up a bit is all.

Thanks Alan, for your tough look at the code. I appreciate it.

Dick
