question about csv.DictReader

Wed Apr 3 22:46:18 EDT 2013

On 04/04/2013 02:26, Norman Clerman wrote:
> Hello,
>
> I have the following Python script (some of the lines are wrapped):
>
> #! /usr/bin/env python
>
> import csv
>
> def dict_test_1():
>      """ csv test program  """
>
>      # Open the file Holdings_EXA.csv
>      HOLDING_FILE = 'Holdings_EXA.csv'
>      try:
>          csv_file = open(HOLDING_FILE, 'rt')
>      except IOError:
>          print('Problem opening {0}\nExiting'.format(HOLDING_FILE))
>          exit()
>
>      # create a dictionary reader
>      try:
>          csv_reader = csv.DictReader(csv_file)
>      except NameError:
>          print('Cannot find file {0} to create a dictionary reader \nExiting'.format(HOLDING_FILE))
>          exit()
>
>      # Print the keys in each row
>      i_row = 1
>      for row in csv_reader:
>          print('There are {0} keys in row {1}'.format(len(row.keys()), i_row))
>          print('The keys in  row {0} are \n{1}'.format(i_row, row.keys()))
>          i_row += 1
> dict_test_1()
>
> Here are the lines in file Holdings_EXA.csv:
> Please note that the first field in the first row is "Holdings"
>
> "Holdings","Weighting","Type","Ticker","Style","First Bought","Shares Owned","Shares Change","Sector","Price","Day Change","Day high/low","Volume","52-Wk high/low","Country","3-Month Return","1-Year Return","3-Year Return","5-Year Return","Market Cap Mil","Currency","Morningstar Rating","YTD Return","P/E","Maturity Date","Coupon %","Yield to Maturity"
> "Nestle SA","1.91","EQUITY","NESN","Large Core","1999-12-31","3732276","197810","Consumer Defensive","67.65","-","67.75-67.35","1211531","67.75-53.8","Switzerland","10.42","21.25","10.5","8.84","213475.59","CHF","2","12.92","21.69","-","-","-"
> "HSBC Holdings PLC","1.75","EQUITY","HSBA","Large Value","1999-12-31","21120203","1711934","Financial Services","733.3","-1.4|-0","738.8-731","7839724","739.9-501.2","United Kingdom","14.51","37.17","3.88","2.77","132694.66","GBP","3","13.93","15.55","-","-","-"
> "Novartis AG","1.33","EQUITY","NOVN","Large Core","2003-06-30","2669523","206851","Healthcare","65.95","0.5|0.01","66-65.4","1121549","66-48.29","Switzerland","15.1","36.5","6.16","8.53","158671.66","CHF","4","16.7","17.76","-","-","-"
> "Roche Holding AG","1.31","EQUITY","ROG","Large Growth","2003-05-31","817830","59352","Healthcare","214.8","1.4|0.01","215.2-213.1","684173","220.4-148.4","Switzerland","17.45","37.95","7.78","4.09","34000","CHF","3","18.09","19.05","-","-","-"
>
> Finally, here are the results of running the script:
>
>
> norm@lima:~/python/overlap$ python dict_test_1.py
> There are 27 keys in row 1
> The keys in  row 1 are
> ['Style', 'Day Change', 'Coupon %', 'Yield to Maturity', 'P/E', 'Type', 'Weighting', 'Price', '3-Month Return', 'Volume', '\xef\xbb\xbf"Holdings"', 'Ticker', 'Shares Change', 'Shares Owned', 'YTD Return', '5-Year Return', 'Market Cap Mil', 'Country', '3-Year Return', 'Day high/low', 'Maturity Date', '1-Year Return', 'Sector', 'Morningstar Rating', 'Currency', '52-Wk high/low', 'First Bought']
> There are 27 keys in row 2
> The keys in  row 2 are
> ['Style', 'Day Change', 'Coupon %', 'Yield to Maturity', 'P/E', 'Type', 'Weighting', 'Price', '3-Month Return', 'Volume', '\xef\xbb\xbf"Holdings"', 'Ticker', 'Shares Change', 'Shares Owned', 'YTD Return', '5-Year Return', 'Market Cap Mil', 'Country', '3-Year Return', 'Day high/low', 'Maturity Date', '1-Year Return', 'Sector', 'Morningstar Rating', 'Currency', '52-Wk high/low', 'First Bought']
> There are 27 keys in row 3
> The keys in  row 3 are
> ['Style', 'Day Change', 'Coupon %', 'Yield to Maturity', 'P/E', 'Type', 'Weighting', 'Price', '3-Month Return', 'Volume', '\xef\xbb\xbf"Holdings"', 'Ticker', 'Shares Change', 'Shares Owned', 'YTD Return', '5-Year Return', 'Market Cap Mil', 'Country', '3-Year Return', 'Day high/low', 'Maturity Date', '1-Year Return', 'Sector', 'Morningstar Rating', 'Currency', '52-Wk high/low', 'First Bought']
> There are 27 keys in row 4
> The keys in  row 4 are
> ['Style', 'Day Change', 'Coupon %', 'Yield to Maturity', 'P/E', 'Type', 'Weighting', 'Price', '3-Month Return', 'Volume', '\xef\xbb\xbf"Holdings"', 'Ticker', 'Shares Change', 'Shares Owned', 'YTD Return', '5-Year Return', 'Market Cap Mil', 'Country', '3-Year Return', 'Day high/low', 'Maturity Date', '1-Year Return', 'Sector', 'Morningstar Rating', 'Currency', '52-Wk high/low', 'First Bought']
> norm@lima:~/python/overlap$
>
>
> Can anyone explain the presence of the characters "\xef\xbb\xbf" before the first field contents "Holdings"?
>
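Those three bytes are a UTF-8 signature (the byte order mark U+FEFF
encoded as UTF-8). Some programs on Microsoft Windows write it at the
start of a text file to indicate that the text is encoded as UTF-8.
Because the signature comes before the opening quote, the csv module
does not treat that quote as a field delimiter, which is why the quotes
also end up in the key. (Does the file also have "\r\n" line endings?
Presumably it was created on a Windows system.)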

Try opening the file with the "utf-8-sig" encoding instead; this will 
drop the signature if present.
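
A minimal sketch of the difference, assuming Python 3 (where open()
accepts an encoding argument). The sample bytes and the helper name
read_fieldnames are made up for illustration; they only mimic the start
of Holdings_EXA.csv.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical stand-in for Holdings_EXA.csv: a UTF-8 signature followed by
# a header row and one data row, with "\r\n" line endings.
sample = codecs.BOM_UTF8 + b'"Holdings","Weighting"\r\n"Nestle SA","1.91"\r\n'
with tempfile.NamedTemporaryFile(suffix='.csv', delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name


def read_fieldnames(path, encoding):
    """Return the header names csv.DictReader sees for the given encoding."""
    # newline='' is what the csv documentation recommends for csv input files.
    with open(path, newline='', encoding=encoding) as csv_file:
        return csv.DictReader(csv_file).fieldnames


# Plain "utf-8" keeps the signature as U+FEFF in front of the first header.
with_signature = read_fieldnames(sample_path, 'utf-8')
# "utf-8-sig" drops the signature if it is present.
without_signature = read_fieldnames(sample_path, 'utf-8-sig')
os.remove(sample_path)

print(with_signature)     # ['\ufeff"Holdings"', 'Weighting']
print(without_signature)  # ['Holdings', 'Weighting']
```

The output in the original post shows byte strings, which suggests
Python 2. There the built-in open() has no encoding argument and the csv
module works on byte strings, so one alternative under that assumption
is to read the first three bytes and skip them when they equal
codecs.BOM_UTF8.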