11/26/2003, Probability and Statistics for Teachers, Math 507, Lecture 13

GATHERING DATA: The Nonmathematical Side of Statistics

The Centrality of Data

Probability begins with axioms and models, not data. Statistics begins with data. After the statistics reform movement of the past decade most freshman statistics courses “emphasize” data. That is, they try to give the students some experience working with real world data sets. These data sets come printed in the back of the book or in supplementary diskettes or CDs, sometimes with software for performing simple statistical analysis.

Most freshman statistics texts have little to say about how to gather data. They generally have an introductory chapter or two talking about types of data (nominal/categorical, ordinal, interval, ratio), about the difference between population and sample, about types of samples (random, stratified, cluster, convenience), about the difference between experiments and observational studies, and about a couple of well-known statistical gaffes (e.g., Dewey Defeats Truman). The treatment, however, is often brief and lacking in insight.

Such courses give the impression that gathering data is a relatively easy part of statistical analysis. The course focuses on the analysis of data, implying that this is where the real work of the statistician lies. In fact, the gathering of good data is tremendously hard. The techniques of doing so are a major study in their own right.

When we teach our students and ourselves to read statistics critically, the first question we should raise is, “How was the data collected?” It is much easier to get bad data than good, and bad data will produce bad results regardless of what mathematical tools we use to analyze it.

Good Data: The Salk Polio Vaccine

The source for this information is chapters 1 and 2 of Statistics, 2e, by Freedman, Pisani, Purves, and Adhikari (W.W. Norton & Company, 1991, ISBN 0-393-96043-9). I highly recommend this book if you really want to understand statistics. It presents a great deal of good information clearly and readably.

The first major polio epidemic hit the U.S. in 1916. In 1954 the Public Health Service was ready to perform a large-scale field test of the vaccine developed by Jonas Salk. It had proved safe and effective in laboratory experiments.

The goal of this test was to compare the incidence of polio among vaccinated children (the treatment group) with the incidence among non-vaccinated children (the control group). This is a common sort of statistical study. If we can somehow make the treatment and control groups identical in all ways except whether they receive treatment, then we can attribute any observed differences (e.g., different polio rates) to the treatment. The challenge is to make the two groups identical.

Note, by the way, that we do not expect the vaccine to work perfectly. It will protect some children and not others. It will reduce the rate of polio but not to zero.

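The comparison of incidence between a treatment group and a control group can be sketched in a few lines of Python. This is only an illustration: the group sizes and case counts below are invented and are not the results of the 1954 field trial, and the function name incidence_per_100k is hypothetical.

```python
def incidence_per_100k(cases: int, group_size: int) -> float:
    """Return the number of cases per 100,000 members of a group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return cases * 100_000 / group_size


# Hypothetical counts, invented for illustration only.
treatment = {"size": 200_000, "cases": 50}
control = {"size": 200_000, "cases": 150}

treatment_rate = incidence_per_100k(treatment["cases"], treatment["size"])
control_rate = incidence_per_100k(control["cases"], control["size"])

# Fraction by which the rate in the treatment group falls below the
# rate in the control group. It is below 1 whenever the treatment
# group still has cases, i.e. the vaccine helps without being perfect.
relative_reduction = 1 - treatment_rate / control_rate

print(f"treatment rate per 100,000: {treatment_rate:.1f}")
print(f"control rate per 100,000:   {control_rate:.1f}")
print(f"relative reduction:         {relative_reduction:.1%}")
```

Rates per 100,000 are used rather than raw counts so that groups of different sizes can be compared. The arithmetic says nothing about whether the two groups were comparable to begin with, and that is the hard part of the study.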
The Public Health Service wanted to perform a test on children in grades one, two, and three, the most susceptible ages (in the end the test involved about 750,000 children). One plausible approach was to inoculate all the children and see if the polio rate dropped compared to the previous year. Polio, however, is an epidemic disease whose rates vary dramatically from year to year. If rates dropped, we would not know whether the vaccine was effective or it was simply a low-incidence year.

Thus it was decided to vaccinate some of the children and leave others unvaccinated so as to be able to compare the groups during the same year. Is it unethical, however, to leave some children intentionally unprotected? The point is that we do not yet know how effective the vaccine is, and we do not know what risks it presents. In particular we do not know whether the benefits outweigh the risks.

The next question is how to decide which children to vaccinate. First of all, we cannot vaccinate children without their parents' approval. Perhaps we can just vaccinate the children whose parents approve and use those whose parents do not approve as our