COPYRIGHTED MATERIAL61 Process Actions And, 85–89 Pentaho Administrative Console
Total Page:16
File Type:pdf, Size:1020Kb
Index A Active@ ISO Burner, 22 absolute positioning, PRD, 398 Ad Hoc Report Wizard, 373–375 access control Add Cube button, 466–467 metadata layer refining privileges, 349 Add Job process action, Scheduler, schema design and, 483 418–419 accumulating periodic snapshot fact table, Add Levels, 474–476 149–150 Add Parameter function, PRD, 386–389 action definitions, 68 Add Sequence step, PDI action sequence editor, 80–82 loading date dimension, 268–269 Action Sequence Wizard, 80, 86 loading demography dimension, 283, 284 action sequences additive measures, 150 Addresses tab page, Mail job, 290 adding as templates for Design Studio, 88 Ad-Hoc Report component, 192 creating with PDS. See PDS (Pentaho admin user Design Studio) creating slave servers, 340 Customers per Website pie chart, managing PDI repository accounts, 548–551 326–327 executed by solution engine, 68 PDI repository, 324 executing in background, 422–423 administration-console, 38, 44 functionality of, 78 administrative tasks inputs for, 83–85 data sources, 60–61 for location data, 559–561 managing schedules and subscriptions, outputs for, 85 COPYRIGHTED MATERIAL61 process actions and, 85–89 Pentaho Administrative Console. See programming Scheduler with, 412, PAC (Pentaho Administrative 417–420 Console) running jobs within, 335 user management, 58–60 solution repository containing, 68 Administrator profile, PDI repository, 327 subscribing to, 423–426 Advanced category, Database Connection using transformations in, 334–336 dialog, 250 actions, process, 85–89 Advisor button, PAD, 501 Active Directory (AD), and EE single AgeandAgeSequencestep,demography sign-on, 77 dimensions, 282 571 572 Index ■ A Age Group step, demography dimensions, ARFF (Attribute Relation File Format), 282, 284–285 511, 519 Aggregate Designer, 130, 442 AS. See action sequences aggregate tables ASPs (Application Service Providers), creating manually, 500 144 drawbacks of, 502 assignments, user management, 59 extending Mondrian with, 497–500 association, as data mining tool, 507–508 generating and populating, 445 at utility, job scheduling, 421 Pentaho Analysis Services, 445 Attribute Relation File Format (ARFF), aggregation 511, 519 alternatives to, 502 attributes benefits of, 496–497 dimensions, 476–477 data integration process, 229 global data mart data model, 206–207 data warehouse design, 163–164 hierarchies, 472–473 data warehouse performance, 130 level, 474–475 Mondrian, 496 measures, 469–470 PRD reports, 393–395 Mondrian cubes, 467 restricting results of, 157 not fitting into dimension tables, Slice and Dice pivot table example of, 180–181 17–18 audit columns, 163–164 using subreports for different, 404–406 authentication WAQR reports, 374 hibernate database storing data on, AJAX technology, CDF dashboards, 45, 47 529–530 JDBC security configuration, 50 algorithms, as data mining tool, 508–509 Mail job entry configuration, 290–291 aliases, 152–153, 384–385 Pentaho Administrative Console all level, MDX hierarchies, 450–451 configuration, 57–58 all member, MDX hierarchies, 450 slave server configuration, 340 All Schedules panel, server admin’s SMTP configuration, 53–54 workspace, 428 Alves, Pedro, 530 Spring Security handling, 60, 69 analysis authorization hibernate examples of, 16–19 database storing data on user, views in Pentaho BI Server, 484–485 47, 60 views in user console, 73–74 JDBC security configuration, 50 analytical databases, 142–143 managing user accounts, 327–328 analytics, business, 503 Spring Security handling, 60, 69 AND operator, multiple restrictions, 157 user configuration, 58–60 appliances, data warehouse, 143–144 automated knowledge discovery, 503. Application Service Providers (ASPs), 144 See also data mining architecture automatic startup, UNIX/Linux systems, Community Dashboard Framework, 40–41 532–534 availability, remote execution, 338–339 data warehouse. See data warehousing averages, calculating, 217–218 architecture axes Pentaho Analysis Services, 442–444 controlling dimension placement on, Pentaho BI, 64 489–490 reporting, 371–373 dimensions on only one axis, 455 archiving MDX representing information on data warehouse performance and, 132 multiple axes, 453 transaction data, 128 Azzurri Clay, 32 Index ■ B–C 573 B bridge tables back office maintenance of, 229 data warehouse architecture, 117 multi-valued dimensions and, 182–183 database support for, 95–96 navigating hierarchies using, 184–185 back-end programs, Pentaho BI stack, 66 browsers, logging in, 6–7 background directory, content repository, Building the Data Warehouse (Inmon), 413 113–114 background execution, 422–423, 426–429 built-in variables, 314 backup, PDI repository, 329 bum, managing Linux init scripts, 42 banded report tools, 376 Burst Sales Report, 54 bar charts, 400–402 bursting Base concept, metadata layer, 357 defined, 430 batch-wise ETL, 118 implementing in Pentaho, 430 BC (business case), 192. See also World other implementations, 438 Class Movies overview of, 430 BDW (Business Data Warehouse), 111, 115 rental reminder e-mails example, BETWEEN...AND operators, 151–152 430–438 BI (business intelligence). See also Pentaho bus architecture, 179–189 BI Server business analysts, 193–195 analytics and, 503 business analytics, 503 business case (BC), 192. See also WCM components, 70–73 (World Class Movies), example dashboards and, 529 business case data mart design and. See data marts, Business Columns, at logical layer, design 362–363 data mining and, 505–506 Business Data Warehouse (BDW), 111, 115 definition of, 107 business intelligence. See BI (business example business case. See WCM (World intelligence) Class Movies), example business Business Intelligence server. See Pentaho case BI Server importance of data, 108–109 business layer, metadata model, 71 platform, Pentaho BI stack, 64 business modeling, using star schemas. See purpose of, 105–109 star schemas real-time data warehousing and, 140–142 Business Models, logical layer, 362 reports and. See reports business rules engines, 141 using master data management, 127–128 Business Tables, 362–365 BI Developer examples Business Views, 71, 362 button-single-parameter.prpt,13–14 button-single-parameter.prpt sample, CDF section, 530 13–14 overview of, 8–9 Regional Sales - HTML reporting, 11–12 Regional Sales - Line/Bar Chart, 16–17 C Slice and Dice Analysis, 17–18 c3p0 connection pool, 49–50 BIRT reports, 72, 372 C4.5 decision tree algorithm, 508, 512 biserver-ce, 38. See also Pentaho home C5.0 decision tree algorithm, 508–509 directory caching, Join Rows (Cartesian product) bitmap indexes, and data warehousing, step, 280 129–130 Calculate and Format Dates step, 269–273 blob field, reports with images, 401–403 Calculate Time step, 277, 281 BOTTOMCOUNT function, MDX queries, calculated members, 459–460, 483 457 calculations, PRD functions for, 395 574 Index ■ C Calculator step, PDI summary, 569–570 age and income groups for demography synergy with community and Pentaho dimension, 285 corporation, 529–530 Calculate Time step, 281 templates, 538 current and last year indicators, 276 .xcdf file, 537–538 loading date dimension, 269–273 central data warehouse, 117, 119–121 loading demography dimension, 282 CEP (complex event processing), 141 calendar days, calculating date Change Data Capture. See CDC (Change dimensions, 270 Data Capture) Carte server CHAPTERS, axes, 453 clustering, 341–342 Chart editor, 398–399 creating slave servers, 340–341 charts. See also Customers per Website pie as PDI tool, 231 chart remote execution and, 337–339, 341 adding bar charts to PRD reports, 400 running, 339–340 adding images to PRD reports, 401–404 carte.bat script, 339–340 adding pie charts to PRD reports, carte.sh script, 339–340 400–402 Cartesian product, 153 adding to PRD reports, 397–400 Cascading Style Sheets. See CSS bursting. See bursting (Cascading Style Sheets) examples, 14–16 catalog, connecting DataCleaner with, 202 including on dashboards, 548 Catalog of OMG Modeling and Metadata JPivot, 494–496 Specifications, 352 not available in WAQR, 375 CategorySet collector function, 399–400 reacting to mouse clicks on pie, 554–555 causality, vs. correlation in data mining, Check if Staging Table Exists step, 293–294 508 child objects, and inheritance, 356 CDC (Change Data Capture), 133–137 Citrus, 13–14 choosing option, 137 class path, Weka data mining, 512–515 intrusive and non-intrusive processes, classification 133 algorithms, 508–509 log-based, 136–137 as data mining tool, 506–507 methods of implementing, 226 with Weka Explorer, 524 overview of, 133 client, defined, 65 snapshot-based, 135–136 cloud computing, 144 source data-based, 133–134 clustering trigger-based, 134–135 as data mining tool, 507 CDF (Community Dashboard database connection options for, 250 Framework), 542–569 remote execution with Carte using, 337, concepts and architecture, 532–534 341–342 content template, 541–542 snowflaking and, 186–187 customer and websites dashboard. See collapsed dimensions, 498 customer and websites dashboard collector functions, 397, 399–400 dashboarding examples, 19–20 colors, for PRD reports, 390–391 document template, 538–541 column profiling, 197–198, 199 history of, 530–531 columnar databases, 142–143 home directory, 534–535 column-oriented databases, 502 JavaScript and CSS resources, 536–537 columns overview of, 529 date dimension, 213–216 plug-in, 534 meaningful names for, 163 plugin.xml file, 535–536 quickly opening properties for, 210 skills and technologies for, 531–532 SCD type 3, 174 Index ■ C 575 using dictionary for dependency checks content template, CDF, 533, 541–542