Copyrighted Material
Index

A
A/B testing, 82–84
ACID (atomicity, consistency, isolation, durability) principles, 33
adjacent possible, 201–202
Adner, Ron, The Wide Lens: A New Strategy for Innovation, 202
The Age of the Platform (Simon), xv
Amazon
  Best Buy price matching, 198
  Big Data statistics, 98–99
  Google and Amazon model, 127–128
  shipping fee error, 14
analysis, 77–79
  A/B testing, 82–84
  data visualization, 84–86
    heat maps, 86–87
    Tableau software, 85
    time series analysis, 87–88
    Visually software, 85
  predictive analytics, 100–102, 136–137
    LDMU (Law of Diminishing Marginal Utility), 103
    LLN (Law of Large Numbers), 102–103
  regression analysis, 80–82
  sentiment analysis, 97–98
  text analytics, 95–97
Anderson, Chris
  Free: The Future of a Radical Price, 14
  “Tech Is Too Cheap to Meter,” 14
Angry Birds, 62
Apache
  Cassandra, 124
  Hadoop (See Hadoop)
Apple, Big Data statistics, 99
appliances, 211
applications, 114
AppStore (Apple), 20
automation, 88
  machine learning, 89–90
  nanotechnology, 90–91
  NFC (near field communication), 91–93
  RFID (radio-frequency identification), 91–93
  sensors, 90–91

B
bad data, 162–163
Bayesian methods of analysis, 81–82
Beane, Billy, xv–xvi
  Howe, Art, and, xvii–xviii
  sabermetrics and, xvi–xvii
Best Buy, 198
Bezos, Jeff, 19–20
Bhambhri, Anjul, 17
BI (business intelligence), 68–70
Big Brother, 187
Big Data
  acceptance gains within organization, 171
  bad data and, 162–163
  capabilities, 79–80
  characteristics, 50–52
  checklist, 177
  community knowledge, 173
  as complement, 56–57
  completeness, 65–68
  conferences and, 173
  consumers, 63–64
  current presence, 50–51
  data model evolution, 172–173
  definition, 49–50
  dynamism, 62
  evolution, 201–203
  experiments, 169–171
  fragmentation, 52–54
  goal setting, 166–167
  goals, 178
  Google Trends, 51
  as initiative, 175–176
  IT and, 177–178
  iterativeness, 173–174
  limitations, 105–106
  market size, 10–12
  naysayers, 22
  network effects, 174
  Obama re-election and, 51–52
  pitfalls, 174–180
  precision, 59–61
  predictions and, 57
  revolution, 12
    consumers and, 12–13
    platforms and, 19–20
    social media, 21–22
    technology costs, 14–15
  startup, 165–167
  timing, 24–25
  training for, 168–169
  unpredictability, 62
  venture capitalists and, 132–133
  vision, 171–172
Big Data: The Next Frontier for Innovation, Competition, and Productivity, 77–78
The Big Short (Lewis), xvi
BigQuery (Google), 129
BigTable (Google), 129
Black Swan (Taleb), 60–61
Bricklin, Dan, 30
BrightContext, 98
Brin, Sergei, 19–20
bucket testing. See A/B testing
Burry, Dr. Michael, 60
business readiness, 163–164
BYOD (bring your own device), 63

C
CapitalOne, A/B testing, 82–83
car insurance, 2–5
  Progressive, Snapshot, 3–4
Carnegie Mellon University, Very Large Information Systems course, 168
case studies
  Explorys, 147–152
  NASA, 152–158
  Quantcast, 141–146
Cassandra, 124
citizen journalists, 64
Cloudera, 117
  Kornacker, Marcel, 117
CloudFlare, machine learning and, 89
Cognos, 68
collaborative filtering, 104–105
columnar databases, 125–127
complementary role of Big Data, 56–57
completeness of Big Data, 65–68
composition of data, 39–40
consumers
  Big Data revolution and, 12–13, 63–64
  consumer fatigue, 189–191
costs
  Amazon shipping fee error, 14
  data storage, 14
COTS (commercial off the shelf) system, 32
crowdsourcing, recommendation engines, 58
The Cult of the Amateur (Keen), 206–207

D
data. See also Big Data
  born digital, 40
  composition, 39–40
  versus heuristics, 23–24
  limitations, 218
  metadata, 29
  poly-structured, 35
  versus rules of thumb, 23–24
  science risks, 14–17
  scientists, 14–17
    problem solving, 179–180
  semi-structured, 35–36
  structured data, 30–33
  transactional data, 31
  unstructured, 35
The Data Asset: How Smart Companies Govern Their Data for Business Success (Fisher), 41–42
Data Deluge, 29
  Laney, Douglas, 49–50
  variety, 50
  velocity, 50
  volume, 50
data disconnect, 44–45
Data Driven: Profiting from Your Most Important Business Asset (Redman), 178
data dysfunction, 41–42
data governance, 64
data management
  as continuum, 55–56
  data governance, 64
  MDM (master data management), 64
data mining, 69–70
Data Mining 2.0, 70
data model evolution, 172–173
data storage costs, 14
data theft, employees, 63
data visualization, 84–86
  heat maps, 86–87
  Tableau software, 85
  time series analysis, 87–88
  Visually software, 85
DataFlux, 41–42
De Goes, John, 131
Deep Web, 66
digital, birth of data, 40
DLF (Data Liberation Front), 204
Dremel (Google), 129
Dunbar’s Number, 54
dynamism of Big Data, 62
dysfunctional data, 41–42

E
EDI (Electronic Data Interchange), 36
EHRs (electronic health records), 93–95
e-mail, semi-structured data, 36
employees
  data theft, 63
  knowledge workers, 191–194
employers, BYOD (bring your own device), 63
Enterprise Miner (SAS), 70
Epstein, Theo, xviii
ERD (Entity Relationship Diagram), 32
ERP (enterprise resource planning) systems, 30–31
ETL (extract, transform, and load), 32
exabytes, 11
Exif (Exchangeable Image File) data, 37
experimenting with Big Data, 169–171
Explorys case study, 147–152

F
Facebook
  Big Data statistics, 99
  Google and, 67
  Hadoop and, 114
  machine learning and, 89
fail whale, 170
Feldman, Konrad, 142
filtering, collaborative, 104–105
Fisher, Tony, The Data Asset: How Smart Companies Govern Their Data for Business Success, 41–42
Flickr
  Rush 2012 Las Vegas Photos, 37
  searches, 37–38
fragmentation of Big Data, 52–54
Frankston, Bob, 30
Free: The Future of a Radical Price (Anderson), 14
Friedman, Thomas L., That Used to Be Us: How America Fell Behind in the World It Invented and How We Can Come Back, 213
Fruition Sciences, 52

G
gamification, 7
goal setting, 166–167
Goldbloom, Anthony, 130
Google, 17–19
  Amazon model and, 127–128
  Big Data statistics, 99
  Big Data tools, 129
  BigQuery, 129
  BigTable, 129
  Dremel, 129
  Facebook information and, 67
  machine learning and, 89
  MapReduce, 129
  Safari browser, 184–185
  Street View, 183–184
  Trends, Big Data, 51
Great Recession, 60
Grid Engine, 134
Grimes, Seth, 97

H
Hack, Martin, 89
Hadapt, 119
Hadoop, 114
  Cloudera, 116, 117
  enterprise vendors, 120–121
  Facebook and, 114
  Hadoop: The Definitive Guide (White), 115
  HBase, 115–116
  HDFS (Hadoop Distributed File System), 115
  Hive, 116
  Hortonworks, 118–119
  limitations, 121
  MapR, 118–119
  MapReduce, 115
  origins, 115
  Pig, 116
  Splunk, 118–119
  start-ups, Hadoop-based, 119–120
  Talend, 116
Hammerbacher, Jeff, 16
hardware considerations, 133–136
Harris, Jim, 33
Hastings, Reed, 58–59
HBase, 115–116
HDFS (Hadoop Distributed File System), 115
heat maps, 86–87
heuristics, versus data, 23–24
high elasticity of demand, 209
Hortonworks, 118–119
housing market, 60
Howard, Jeremy, 130
Howe, Art, xvii–xviii
HR (Human Resources), data use, 8–9
Huffington Post, machine learning and, 89–90
Hurricane Sandy, 20
Hyperion, 68

I
IA (Information Access), 96
IM (information management), employees and, 41
infonomics, 17–19, 162
information size
  exabytes, 11
  zettabytes, 11
Infosphere BigInsights, 120
Instagram, 62
insurance, car insurance, 2–5
  Progressive, 3–4
Intelligent Miner, 70
Internet of Things, 207
intranets, 33
IOD (Information on Demand) conference, xv
iPhone, AppStore, 20
IR (Information Retrieval), 96
IT and Big Data, 177–178

J
Jain, Anil, 147
James, Bill, sabermetrics and, xviii
Johnson, Steven, Where Good Ideas Come From: The Natural History of Innovation, 201

K
Kaggle, 129–131
Kahler, Scott, 115
Keen, Andrew, The Cult of the Amateur, 206–207
Kelley, Kevin, xviii
knowledge bases, 33
knowledge workers, 191–194
Kornacker, Marcel, Cloudera, 117
Kryder’s Law, 14

L
Laney, Douglas, 18
  Big Data definition, 49–50
LDMU (Law of Diminishing Marginal Utility), 103
The Lean Startup (Ries), 83–84
Lewis, Michael, xv–xvi
  The Big Short, xvi
  Moneyball: The Art of Winning an Unfair Game, xv–xvi
limitations of data, 218
LLN (Law of Large Numbers), 102–103
Lougheed, Charlie, 147
Lytro, photo focus, 39

M
machine learning, 89–90
Mandelbaum, Michael, That Used to Be Us: How America Fell Behind in the World It Invented and How We Can Come Back, 213
MapR, 118–119
MapReduce (Google), 115, 129
mass market, 53
Mauboussin, Michael J., The Success Equation: Untangling Skill and Luck in Business, Sports, and Investing, 194
McHale, Stephen, 147
McKinsey, Big Data: The Next Frontier for Innovation, Competition, and Productivity, 77–78
McKnight, William, 126–127
MDM (master data management), 64
Meil, Doug, 147
Menino, Thomas, 6
  urban mechanics, 212–213
metadata, 29, 36–39
  Netflix, 58–59
  photos, 36–39
  Rush 2012 Las Vegas Photos, 36
metapredictions, 211
Microsoft Analysis Services, 70
MicroStrategy, 68
Mitchell, Tom, 89
Moneyball: The Art of Winning an Unfair Game (Lewis), xv–xvi
Morey, Daryl, sabermetrics and, xviii
MRP (manufacturing resource planning) systems, 30–31

N
nanotechnology, automation and, 90–91
NASA case study, 152–158
Nest Thermostat, 209–210
Netflix, 58–59
  Qwikster, 59
New Urban Mechanics, 212–213
NewSQL, 124–125
NFC (near field communication),

P
platforms
  AppStore (Apple), 20
  Big Data Revolution and, 19–20
Pole, Andrew, 198–199
poly-structured data, 35
pothole reporting, 5–8
Precog, 131
predictions
  Big Data and, 57
  metapredictions, 211
  pregnancy, 198–200
  Silver, Nate, 20–21, 218–219
predictive analytics, 100–102, 136–137
  LLN (Law of Large Numbers), 102–103
pregnancy predictions, 198–200
privacy, 184–188
  security issues, 188–189
problem solving like data scientists, 179–180
Progressive insurance, Snapshot, 3–4

Q
Quantcast case study, 141–146
querying data, JOIN statements, 32