[Matplotlib-users] Reshaping Tabular Data to Column Data

Scott Lasley selasley at icloud.com
Tue Mar 24 21:28:18 EDT 2020


Does this do what you want?

# read data from noaa, set index to first column and column names to '00' - '12'
df = pd.read_csv('https://www.esrl.noaa.gov/psd/data/correlation/pna.data', 
                 skiprows=1, 
                 skipfooter=3,
                 index_col=[0], 
                 header=None,
                 names=['{:02d}'.format(i) for i in range(13)],
                 delim_whitespace=True)
# unstack the dataframe to get a Series with  and a multiindex 
# consisting of the date and column name
df = df.unstack()
# sort index by [date, column name] to "group" the dates
df = df.sort_index(level=[1,0])
# create a new index using the existing multiindex levels
ndx = [int(''.join([y,c])) 
       for y in df.index.levels[1].astype(str) 
       for c in df.index.levels[0]]
df.index = ndx
df.head(28)

194801   -99.90
194802   -99.90
194803   -99.90
...
194911   -99.90
194912   -99.90
195001    -3.65
195002    -1.69
195003    -0.06
195004    -0.23

df.tail(12)

202001    -0.95
202002    -0.07
202003   -99.90
202004   -99.90
202005   -99.90
202006   -99.90
202007   -99.90
202008   -99.90
202009   -99.90
202010   -99.90
202011   -99.90
202012   -99.90

Note that the 202001 and 202002 values do not agree with the values in your email message but do agree with the values on the website.

Hth,
Scott


> On Mar 24, 2020, at 6:12 PM, Benson, Randall <Randall.Benson at avangrid.com> wrote:
> 
> Hi –
> I seem to be stuck here trying to convert downloaded data that is in tabular form since I need to analyze it in single column format. The data is from here and I’m using read_csv as shown below – 
>  
> pnadata1= pd.read_csv('https://www.esrl.noaa.gov/psd/data/correlation/pna.data')
>  
> The downloaded data looks like this with the values in months 1-12 across the columns below for each year 1948-present –
>  
>                                     1948        2020
> 0   1948 -99.90 -99.90 -99.90 -99.90 -99.90 -99.9...
> 1   1949 -99.90 -99.90 -99.90 -99.90 -99.90 -99.9...
> 2   1950  -3.65  -1.69  -0.06  -0.23  -0.40  -1.9...
> 3   1951  -1.18  -2.11  -1.09   0.47   1.19  -0.5...
> 4   1952  -3.19   0.08  -0.69   0.89  -0.79   0.5...
>  
> And, I need it to be reshaped to look like this in yyyymm values format so I can compare it easily to other downloaded data in the column format –
>  
> 194801 -99.90
> 194802 -99.90
> 194803 -99.90
> ...
> 202001 -3.45
> 202002  0.87
>  
> I’ve tried working with “melt” and “group” in different combinations but cannot figure it out given the online references. Thank you,
> Randall
>  
>  
> <image001.jpg>
> Randall P. Benson, PhD 
> Wind Asset Meteorologist, Energy Resource - Onshore
> 
> 1125 NW Couch Street, Suite 700
> Portland, Oregon, USA 97209 
> Telephone  503-796-7129
> Cell  971-227-2477 
> randall.benson at avangrid.com
> 
> <image002.png>
> 
> In the interest of the environment, 
> please print only if necessary and recycle. 
>  
>  
>  
> ==============================================================
> 
> Please consider the environment before printing this email.
> 
> If you have received this message in error, please notify the sender and immediately delete this message and any attachment hereto and/or copy hereof, as such message contains confidential information intended solely for the individual or entity to whom it is addressed. The use or disclosure of such information to third parties is prohibited by law and may give rise to civil or criminal liability.
> 
> The views presented in this message are solely those of the author(s) and do not necessarily represent the opinion of Avangrid Renewables, LLC. or any company of its group. Neither Avangrid Renewables, LLC. nor any company of its group guarantees the integrity, security or proper receipt of this message. Likewise, neither Avangrid Renewables, LLC. nor any company of its group accepts any liability whatsoever for any possible damages arising from, or in connection with, data interception, software viruses or manipulation by third parties.
> 
>  ==============================================================
> 
> _______________________________________________
> Matplotlib-users mailing list
> Matplotlib-users at python.org
> https://mail.python.org/mailman/listinfo/matplotlib-users



More information about the Matplotlib-users mailing list