[Tutor] (no subject)
Alan Gauld
alan.gauld at btinternet.com
Mon Apr 18 14:11:13 EDT 2016
On 18/04/16 17:52, Halema wrote:
> Hello,
> I have a project and need someone to help below you will see details about it ,
> please if you able to help email me as soon as possible
> and how much will cost !
The good news is it doesn't cost anything but time.
The bad news is that we won't do your homework for you. You need to make
an attempt, then tell us what doesn't work, where you are stuck, etc.
Send us your code and any error messages and a note of your OS and
Python version. Make sure you use plain text email since HTML often gets
mangled in transit.
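If you are not sure how to find the OS and Python version, something
like the following minimal sketch should print them. The function name
environment_summary is just an illustration, not part of any library;
the platform calls it uses are in the standard library.

```python
import platform

def environment_summary():
    """Return a short description of the OS and Python version."""
    # platform.platform() describes the OS; python_version() gives e.g. "3.5.1"
    return "OS: {}\nPython: {}".format(
        platform.platform(), platform.python_version()
    )

print(environment_summary())
```

Paste the two lines it prints into your email along with your code.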
Some of us will then respond with hints and tips.
> You will download data files: 2010 U.S. Mortality Data and ICD10 code file. Both of them are freely available from the CDC website:
> 2010 U.S. Mortality data
> ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/DVS/mortality/mort2010us.zip
> ICD 10 code and description file:
> ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD10/allvalid2009(detailed titles headings).txt
> For the second file, you are allowed to preprocess it before you analyze it. For instance, you may open the text file in Microsoft Excel and only keep the two needed columns (ICD10 code and the corresponding description) and remove all the other columns.
> In this project, you are required to extract the following data items for each entry from the mortality data file: sex, age, race, education, marital status, manner of death, and ICD10 code for the reason of death. You are then required to analyze the extracted data and answer the following questions:
> 1) The male to female ratio (10 points)
> 2) The distribution of age. You may split all people into 12 groups according to their age: 0, 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100, > 100. You may then count how many people were in each group. (10 points)
> 3) The distribution of race. Similarly, you may categorize all people into groups according to their race [male, female, unknown] and report how many people were in each group. (10 points)
> 4) The distribution of education. Similar as above. (10 points)
> 5) The distribution of marital status. Similar as above. (10 points)
> 6) The distribution of manner of death. Similar as above. (10 points)
> 7) The top 10 leading cause of death. You may first figure out the top 10 leading cause of death by counting the occurrence of the ICD10 code first, then determine the corresponding description about the code from the ICD10 code dictionary. (10 points)
> 8) Correlation between education and death age. To calculate correlation coefficient, you should convert both data columns into integers. (10 points)
> 9) Correlation between race and death age. Similar as above. (10 points)
> 10) Correlation between marital status and death age. Similar as above (10 points)
> Hint: For question 2, 3, 4, 5, and 6, you may create a function to finish the task since they have some common parts. (Not mandatory)
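As a general hint rather than a solution: the shared function mentioned
in the assignment's hint could be built around collections.Counter. The
sketch below is only an illustration under assumptions. The names
age_group and distribution are made up, the sample values are invented
rather than taken from the CDC file, and it does not show how to read
the fields out of the mortality data; that part is still up to you.

```python
from collections import Counter

def age_group(age):
    """Map an age in whole years to one of the 12 group labels in question 2."""
    if age == 0:
        return "0"
    if age > 100:
        return "> 100"
    low = (age - 1) // 10 * 10 + 1  # 1, 11, 21, ... 91
    return "{}-{}".format(low, low + 9)

def distribution(values, key=None):
    """Count how often each value (or key(value), if key is given) occurs."""
    return Counter(key(v) if key else v for v in values)

# Invented sample values for illustration only, not real data.
sample_ages = [0, 7, 10, 11, 45, 50, 51, 100, 101]
sample_sexes = ["M", "F", "F", "M", "F"]

print(distribution(sample_ages, key=age_group))
print(distribution(sample_sexes))
```

A Counter also has a most_common(n) method, which may be useful for a
"top 10" style question once you have the codes in a list.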
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos