Applied Predictive Analytics

Principles and Techniques for the Professional Data Analyst

Dean Abbott

Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst

Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2014 by John Wiley & Sons, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-1-118-72796-6
ISBN: 978-1-118-72793-5 (ebk)
ISBN: 978-1-118-72769-0 (ebk)

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2013958302

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. [Insert third-party trademark information] All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

To Barbara

About the Author

Dean Abbott is President of Abbott Analytics, Inc. in San Diego, California.
Dean is an internationally recognized data-mining and predictive analytics expert with over two decades of experience applying advanced modeling and data preparation techniques to a wide variety of real-world problems. He is also Chief Data Scientist of SmarterRemarketer, a startup company focusing on data-driven behavior segmentation and attribution.

Dean has been recognized as a top-ten data scientist and one of the top ten most influential people in data analytics. His blog has been recognized as one of the top-ten predictive analytics blogs to follow.

He is a regular speaker at Predictive Analytics World and other analytics conferences. He is on the advisory board for the University of California Irvine Certificate Program for predictive analytics and the University of California San Diego Certificate Program for data mining, and is a regular instructor for courses on predictive modeling algorithms, model deployment, and text mining. He has also served several times on the program committee for the KDD Conference Industrial Track.

About the Technical Editor

William J. Komp has a Ph.D. from the University of Wisconsin–Milwaukee, with a specialization in the fields of general relativity and cosmology. He has been a professor of physics at the University of Louisville and Western Kentucky University. Currently, he is a research scientist at Humana, Inc., working in the areas of predictive analytics and data mining.

Credits

Executive Editor
Robert Elliott

Project Editor
Adaobi Obi Tulton

Technical Editor
William J. Komp

Senior Production Editor
Kathleen Wisor

Copy Editor
Nancy Rapoport

Manager of Content Development and Assembly
Mary Beth Wakefield

Director of Community Marketing
David Mayhew

Marketing Manager
Ashley Zurcher

Business Manager
Amy Knies

Vice President and Executive Group Publisher
Richard Swadley

Associate Publisher
Jim Minatel

Project Coordinator, Cover
Todd Klemme

Proofreader
Nancy Carrasco

Indexer
Johnna VanHoose Dinse

Cover Designer
Ryan Sneed

Acknowledgments

The idea for this book began with a phone call from editor Bob Elliott, who presented the idea of writing a different kind of predictive analytics book geared toward business professionals. My passion for more than a decade has been to teach principles of data mining and predictive analytics to business professionals, translating the lingo of mathematics and statistics into a language the practitioner can understand. The questions of hundreds of course and workshop attendees forced me to think about why we do what we do in predictive analytics. I also thank Bob for not only persuading me that I could write the book while continuing to be a consultant, but also for his advice on the scope and depth of topics to cover.

I thank my father for encouraging me in analytics. I remember him teaching me how to compute batting average and earned run average when I was eight years old so I could compute my Little League statistics. He brought home reams of accounting pads, which I used for playing thousands of Strat-O-Matic baseball games, just so that I could compute everyone’s batting average and earned run average, and see if there were significant differences between what I observed and what the players’ actual statistics were. My parents put up with a lot of paper strewn across the floor for many years.
I would never have been in this field were it not for Roger Barron, my first boss at Barron Associates, Inc., a pioneer in statistical learning methods and a man who taught me to be curious, thorough, and persistent about data analysis. His ability to envision solutions without knowing exactly how they would be solved is something I reflect on often.

I’ve learned much over the past 27 years from John Elder, a friend and colleague, since our time at Barron Associates, Inc. and continuing to this day. I am very grateful to Eric Siegel for inviting me to speak regularly at Predictive Analytics World in sessions and workshops, and for his advice and encouragement in the writing of this book.

A very special thanks goes to editors Adaobi Obi Tulton and Nancy Rapoport for making sense of what I was trying to communicate and making this book more concise and clearer than I was able to do alone. Obviously, I was a mathematics major and not an English major. I am especially grateful for technical editor William Komp, whose insightful comments throughout the book helped me to sharpen points I was making.

Several software packages were used in the analyses contained in the book, including KNIME, IBM SPSS Modeler, JMP, Statistica, Predixion, and Orange. I thank all of these vendors for creating software that is easy to use. I also want to thank all of the other vendors I’ve worked with over the past two decades, who have supplied software for me to use in teaching and research.

On a personal note, this book project could not have taken place without the support of my wife, Barbara, who encouraged me throughout the project, even as it wore on and seemed to never end.
