A faster way of finding historical highs/lows
David Fraser
davidf at sjsoft.com
Fri Jun 11 17:42:02 EDT 2004
Eamonn Sullivan wrote:
> Peter Hansen <peter at engcorp.com> wrote in message news:<SN2dnT9ob92aPFTdRVn-gQ at powergate.ca>...
>
>>Eamonn Sullivan wrote:
>>
>>
>>>1. Find the most recent date when there was an equal or higher (or
>>>lower) value than X.
>>
>>The fastest algorithm might depend on how you use the data, as well.
>>For example, do you update the data often, and search it rarely,
>>or update it rarely and do the search very often? If searching
>>many times between updates, some preprocessing will likely make things
>>go much faster.
>>
>>Both of your examples sound to me like they would benefit by
>>using sort(), then a binary search. Sort is very fast relative
>>to things like the Python loops you are doing, so using it to
>>prepare the data before the search can be a good step.
>>
>>-Peter
>
>
> Thanks for this. At the moment, the software answers a few questions
> (highest/lowest, longest streak, etc.) once per retrieval of data. The
> database retrieval, though, is *by far* the biggest time sapper, so a
> little preprocessing would be almost unnoticeable in comparison,
> probably (depends on how much, of course).
>
> So, I'm guessing I could sort on the particular price field I'm using
> (decorate-sort-undecorate), and then just find the most recent date
> among the subset of data that meets the criteria (higher or lower). Is
> that what you mean? By binary search, do you mean further reorganizing
> the data into a binary tree using date?
If you have this in a relational database, you might find that the
database can answer the question quicker for you, using indexes if
neccessary ...
select max(xxx) from yyy where zzz > 40
with an index on xxx and zzz will usually be done quickly internally by
the database, and then you just get the result returned rather than
having to process it
David
More information about the Python-list
mailing list