Pentaho Business Analytics Platform
Ingest, Prepare, Blend and Analyze
Figure 1. Pentaho Business Analytics Platform
[Diagram labels: Power to the User – Any User; Business and IT; Enterprise Governance and Scalability; Easily Embeddable Into Enterprise; Insight Into All Data (Big and Diverse Data); Public/Private Clouds; Operational Data; Big Data; Data Stream; 100% Java Open Web-Based APIs Pluggable Architecture; Data Integration; Business Analytics; Ad Hoc Analysis]
By tightly coupling data integration with business analytics, the Pentaho platform from
Hitachi Vantara brings together IT and business users to ingest, prepare, blend and
analyze all data that impacts business results. Pentaho’s open source heritage drives
continued innovation in a modern, unified, flexible analytics platform that helps
organizations accelerate their analytics data pipelines.
Pentaho’s open, embeddable platform (see Figure 1) supports
flexible analytics that both leverage existing data infrastructure
and future-proof deployments against tomorrow’s inevitable
changes. Intuitive data integration and preparation capabilities
drastically reduce the hand coding required to bring data together
for insight. At the same time, Pentaho Business Analytics provides
a spectrum of analytics for all user roles, from visual data
analysis for business analysts to tailored dashboards for executives.
Pentaho is fast to deploy, easy to use, and purpose-built for
the future of analytics.
Data Integration
Organizations face an increasing challenge to manage and extract
value from a growing variety and volume of data across their
edge-to-cloud infrastructure. With Pentaho Data Integration (PDI),
organizations can access data from complex and heterogeneous
sources and blend it with existing relational data to produce
high-quality, ready-to-analyze information — all without writing a
line of code. A rich graphical user interface paired with a powerful
multithreaded transformation engine offers high-performance ETL
(extract, transform and load) capabilities that cover all data integration
needs, including big data ingestion and processing.
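For readers who want to picture the hand coding that a graphical ETL tool replaces, the following is a minimal Python sketch of the extract, transform, load and blend pattern described above. It is not PDI code and does not use any Pentaho API. The sample CSV content, the table names, the customer names and the function `blend_orders_with_customers` are all hypothetical, and an in-memory SQLite database stands in for an existing relational source.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract: order lines keyed by customer id.
ORDERS_CSV = """customer_id,amount
1,120.50
2,80.00
1,29.50
"""


def blend_orders_with_customers(orders_csv: str) -> list[tuple[str, float]]:
    """Blend flat-file orders with relational customer data (illustrative only)."""
    conn = sqlite3.connect(":memory:")

    # Stand-in for existing relational data (assumed schema and rows).
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")

    # Extract: read rows from the flat file.
    rows = csv.DictReader(io.StringIO(orders_csv))

    # Transform: cast text fields to typed values.
    cleaned = [(int(r["customer_id"]), float(r["amount"])) for r in rows]

    # Load: write the cleaned rows into the relational store.
    conn.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)

    # Blend: join the loaded orders with the customer table and aggregate.
    result = conn.execute(
        "SELECT c.name, SUM(o.amount) "
        "FROM orders o JOIN customers c ON c.id = o.customer_id "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, total in blend_orders_with_customers(ORDERS_CSV):
        print(name, total)
```

Each commented stage in the sketch corresponds roughly to what a visual pipeline would represent as a separate step, which is the work the drag-and-drop approach is meant to save.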
Pentaho Data Integration Features
Intuitive drag-and-drop interface to simplify the creation of analytic
data pipelines (see Figure 2).
Broad connectivity to virtually any data source, either on premises
or in the cloud, including flat files, relational database management
systems (RDBMS), APIs and more.
Integration with transactional databases, including Oracle, IBM,
PostgreSQL, MySQL and others.
Access to data in enterprise applications, including SAP, Google Analytics and more.
Rich library of prebuilt components to access, prepare, blend and
cleanse data.
Direct access to complete analytics, including charts, visualizations
and reporting from any step of PDI.
Robust orchestration capabilities to coordinate complex workflows,
including scheduling and alerts.
Integration of advanced analytic models from R, Python, Scala and
Weka that incorporate libraries, such as scikit-learn, Spark MLlib,
TensorFlow and Keras, into the data flow.
Enterprise-grade administration, scalability, load balancing,
containerization* and security capabilities.
Big Data
The Pentaho platform enables companies to realize business value
from large volumes of diverse data by dramatically reducing the
time and complexity required to design, develop and deploy big
data analytics. Pentaho covers the entire big data life cycle, from
data extraction and preparation of diverse data, to scalable processing
on Spark and Hadoop, leading to end-to-end analytics.
Pentaho Is the Leading Solution for Big Data Integration and
Visual design environment for blending multiple big data sources
(see Figure 3) and processing data at scale.
Integration with leading Hadoop distributions, object stores,
NoSQL stores and analytic databases, as well as log file data and
JSON/XML formats.
Code-free data transformation design that is 15 times more
productive than hand coding and executes Spark or
Hadoop jobs in clusters for high performance.
Operationalize with Spark stream and batch job execution, SQL
on Spark connectivity, Kafka access and more.
Figure 2. Drag-and-Drop Data Transformation in Pentaho Data Integration
Figure 3. Variety of Big Data Sources Supported by Pentaho
“Using Pentaho, we are now helping clients blend a 360-degree
view of all equipment data sources to enable early
prediction of potential machinery failure.”
Caterpillar Marine Asset Intelligence
Seamlessly switch between execution engines such as Spark and
Pentaho’s native engine to fit data volume, velocity and transformation
complexity.
Template-based approach to rapidly onboard data sources into
Hadoop via the metadata injection feature set.
Adaptive big data layer that enables smooth portability of transformations
across different Hadoop and Spark distributions.
Seamlessly scale data transformations with containerized
Pragmatic solutions to deliver on-demand data marts in a big data
Multicloud Support
Extend the benefits of the open, extensible Pentaho platform to
address a wide range of needs in multicloud, hybrid cloud and private cloud
deployments. Pentaho’s modern data architecture simplifies management
of your increasingly distributed data architecture with a
single data management tool.
End-to-End Data Platform
Connectivity to cloud storage and computing in AWS, Google
Cloud and Microsoft Azure.
Support for popular cloud data warehouses, including Amazon
Redshift, Snowflake and Google BigQuery.
Big data processing in Amazon EMR and HDInsight environments.
Filtering and contextual analysis of streaming data in AWS Kinesis
and Kafka.
Bulk loading for popular cloud data warehouses, including
Amazon Redshift and Snowflake.
Business Analytics
Pentaho Business Analytics provides a modern, highly interactive,
and intuitive web-based interface for business users to discover
and explore virtually any data. With a full spectrum of analytics tools,
users can create reports and dashboards as well as visualize and
analyze data across multiple dimensions without dependence on IT
or developers. Meanwhile, IT benefits from secure, scalable and governed
analytics for the whole enterprise. Pentaho can be deployed
on premises or in the cloud and can be seamlessly embedded into
other software applications. Pentaho Business Analytics provides:
Ad Hoc Analysis and Visualization
A rich library of interactive visualizations such as geographic maps,
heat grids, bubble charts and more (see Figure 4).
Extreme scale of in-memory data caching for speed-of-thought
analysis on large data volumes using a drag-and-drop paradigm.
Visual lasso filtering and zooming to understand or exclude
Attribute highlighting for better visual contrast among data
The ability to drill down into supporting reports for detailed data.
Flexible Dashboards
Web-based drag-and-drop dashboard designer for business
Portal and mashup integration to seamlessly integrate business
analytics with other web applications.
Rich visualizations with navigation, drill-down capabilities and a
library of filter controls.
Advanced dashboard framework for 100% tailored user
User-Driven Reporting
Full support for operational reports, parameterized reports, and
interactive self-service reporting against transactional databases.
Intuitive web-based interactive reporting for business users.
Rich graphical pixel-perfect report designer for power users.
Mobile Business Analytics
Ability to access data discovery, interactive analysis and visualization
on mobile devices.
Optimized mobile experience with native gestures, such as touch filtering,
drill through and touch-enabled drag and drop.
Figure 4. Heat Grid Analysis in Pentaho Business Analytics
HITACHI and Lumada are trademarks or registered trademarks of Hitachi, Ltd. Pentaho is a trademark or registered trademark of Hitachi Vantara LLC. IBM and DB2 are trademarks or
registered trademarks of International Business Machines Corporation. Microsoft and Azure are trademarks or registered trademarks of Microsoft Corporation. All other trademarks, service
marks, and company names are properties of their respective owners.
P-005-F BTD March 2020
Hitachi Vantara
Corporate Headquarters
2535 Augustine Drive
Santa Clara, CA 95054 USA
Contact Information
USA: 1-800-446-0744
Global: 1-858-547-4526
Embedded Analytics
Pentaho’s flexible cloud-ready platform is purpose-built for
embedding into and integrating with your applications, portals
and processes. Our powerful analytics and extensible architecture
ensure that you can get to market quickly and delight your customers.
Our embedded analytics solution offers:
The ability to seamlessly embed real-time visualizations, reports
and dashboards into existing applications. (See Figures 5 and 6.)
Highly customizable web-based user interface and robust web
APIs that offer maximum control over the look, feel and user
Flexible capabilities for multitenant deployment as well as rich
single sign-on and security integration.
Tailored training and access to architect-level staff who have
made hundreds of organizations successful.
Any Analytics, Any Data, Simplified
Pentaho addresses the barriers that block organizations from getting
value from all of their data. Our platform simplifies the process
of preparing and blending any data and includes a spectrum of
tools to easily analyze, visualize, explore, report and predict. Open,
embeddable and extensible, Pentaho is architected to ensure that
each member of your team — from developers to business users
— can easily translate data into value.
Figure 5. Intuitive Dashboard in Pentaho Business Analytics
Figure 6. Pentaho Analytics Capability Embedded Into a Web Application
“After reviewing ve different proprietary and open source platforms, Pentaho emerged as
the best. I believe it is a future-proof solution that will help us to ensure data governance,
‘one version of the truth,’ and a great user experience.”
CERN, the European Organization for Nuclear Research
* 9.0 Beta Feature