Superset Documentation
Apache Superset Dev

Dec 05, 2019

CONTENTS

1 Superset Resources
2 Apache Software Foundation Resources
3 Overview
    3.1 Features
    3.2 Databases
    3.3 Screenshots
    3.4 Contents
    3.5 Indices and tables

Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application.

Important: Disclaimer: Apache Superset is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Note: Apache Superset, Superset, Apache, the Apache feather logo, and the Apache Superset project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.

CHAPTER ONE
SUPERSET RESOURCES

• Superset’s GitHub; note that we use GitHub for issue tracking
• Superset’s contribution guidelines and code of conduct on GitHub
• Our mailing list archives. To subscribe, send an email to [email protected]
• Join our Slack

CHAPTER TWO
APACHE SOFTWARE FOUNDATION RESOURCES

• The Apache Software Foundation Website
• Current Events
• License
• Thanks to the ASF’s sponsors
• Sponsor Apache!

CHAPTER THREE
OVERVIEW

3.1 Features

• A rich set of data visualizations
• An easy-to-use interface for exploring and visualizing data
• Create and share dashboards
• Enterprise-ready authentication that integrates with major authentication providers (database, OpenID, LDAP, OAuth & REMOTE_USER through Flask AppBuilder)
• An extensible, high-granularity security/permission model allowing intricate rules on who can access individual features and the dataset
• A simple semantic layer, allowing users to control how data sources are displayed in the UI by defining which fields should show up in which drop-down and which aggregation and function metrics are made available to the user
• Integration with most SQL-speaking RDBMS through SQLAlchemy
• Deep integration with Druid.io

3.2 Databases

The following RDBMS are currently supported:

• Amazon Athena
• Amazon Redshift
• Apache Drill
• Apache Druid
• Apache Hive
• Apache Impala
• Apache Kylin
• Apache Pinot
• Apache Spark SQL
• BigQuery
• ClickHouse
• Google Sheets
• Greenplum
• IBM Db2
• MySQL
• Oracle
• PostgreSQL
• Presto
• Snowflake
• SQLite
• SQL Server
• Teradata
• Vertica

Other database engines with a proper DB-API driver and SQLAlchemy dialect should be supported as well.

3.3 Screenshots

[Screenshot images]

3.4 Contents

3.4.1 Installation & Configuration

Getting Started

Superset has deprecated support for Python 2.* and supports only ~=3.6 to take advantage of the newer Python features and reduce the burden of supporting previous versions. We run our test suite against 3.6, but running on 3.7 should work as well.

Cloud-native!

Superset is designed to be highly available. It is “cloud-native” as it has been designed to scale out in large, distributed environments, and works well inside containers.
While you can easily test drive Superset on a modest setup or simply on your laptop, there’s virtually no limit around scaling out the platform. Superset is also cloud-native in the sense that it is flexible and lets you choose your web server (Gunicorn, Nginx, Apache), your metadata database engine (MySQL, Postgres, MariaDB, ...), your message queue (Redis, RabbitMQ, SQS, ...), your results backend (S3, Redis, Memcached, ...) and your caching layer (Memcached, Redis, ...). It also works well with services like NewRelic, StatsD and DataDog, and has the ability to run analytic workloads against most popular database technologies.

Superset is battle tested in large environments with hundreds of concurrent users. Airbnb’s production environment runs inside Kubernetes and serves 600+ daily active users viewing over 100K charts a day.

The Superset web server and the Superset Celery workers (optional) are stateless, so you can scale out by running on as many servers as needed.

Start with Docker

Note: The Docker-related files and documentation have been community-contributed and are not actively maintained and managed by the core committers working on the project. Some issues have been reported as of 2019-01. Help and contributions around Docker are welcomed!

If you know Docker, you’re in luck: there is a shortcut for initializing a development environment:

    git clone https://github.com/apache/incubator-superset/
    cd incubator-superset/contrib/docker
    # prefix with SUPERSET_LOAD_EXAMPLES=yes to load examples:
    docker-compose run --rm superset ./docker-init.sh
    # you can run this command every time you need to start superset now:
    docker-compose up

After several minutes for Superset initialization to finish, you can open a browser and view http://localhost:8088 to start your journey.

From there, the container server will reload on modification of the Superset Python and JavaScript source code. Don’t forget to reload the page to take the new frontend into account though.
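Because initialization takes several minutes, it can be handy to poll the port rather than refresh the browser by hand. The following is only a minimal sketch and not part of Superset: the helper names `is_up` and `wait_until_up`, the number of attempts and the delay are all illustrative assumptions, and the only detail taken from this guide is the URL http://localhost:8088.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if `url` answers with any HTTP response (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just not with a success status; it is listening.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, malformed reply, ...
        return False


def wait_until_up(url, attempts=60, delay=5.0, probe=is_up, sleep=time.sleep):
    """Call `probe(url)` up to `attempts` times, sleeping `delay` seconds
    after each failed attempt. Return True as soon as a probe succeeds,
    or False if every attempt fails.
    """
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


# Example call (assumes the docker-compose setup above is starting up):
#     wait_until_up("http://localhost:8088")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server; in normal use the defaults are enough.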
See also CONTRIBUTING.md#building for an alternative way of serving the frontend.

It is also possible to run Superset in non-development mode: in the docker-compose.yml file, remove the volumes needed for development and change the variable SUPERSET_ENV to production.

If you are building on a Mac and the build exits with code 137, you need to increase your Docker resources. OSX instructions: https://docs.docker.com/docker-for-mac/#advanced (search for memory).

Or, if you’re curious and want to install Superset from the bottom up, then go ahead.

See also contrib/docker/README.md

OS dependencies

Superset stores database connection information in its metadata database. For that purpose, we use the cryptography Python library to encrypt connection passwords. Unfortunately, this library has OS level dependencies.

You may want to attempt the next step (“Superset installation and initialization”) and come back to this step if you encounter an error.

Here’s how to install them:

For Debian and Ubuntu, the following command will ensure that the required dependencies are installed:

    sudo apt-get install build-essential libssl-dev libffi-dev python-dev python-pip libsasl2-dev libldap2-dev

Ubuntu 18.04: If you have python3.6 installed alongside python2.7, as is the default on Ubuntu 18.04 LTS, run this command also:

    sudo apt-get install build-essential libssl-dev libffi-dev python3.6-dev python-pip libsasl2-dev libldap2-dev

otherwise the build for cryptography fails.

For Fedora and RHEL-derivatives, the following commands will ensure that the required dependencies are installed:

    sudo yum upgrade python-setuptools
    sudo yum install gcc gcc-c++ libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel

Mac OS X: If possible, you should upgrade to the latest version of OS X as issues are more likely to be resolved for that version.
You will likely need the latest version of XCode available for your installed version of OS X. You should also install the XCode command line tools:

    xcode-select --install

System python is not recommended. Homebrew’s python also ships with pip:

    brew install pkg-config libffi openssl python
    env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" pip install cryptography==2.4.2

Windows isn’t officially supported at this point, but if you want to attempt it, download get-pip.py, and run python get-pip.py, which may need admin access. Then run the following:

    C:\> pip install cryptography

    # You may also have to create C:\Temp
    C:\> md C:\Temp

Python virtualenv

It is recommended to install Superset inside a virtualenv. Python 3 already ships virtualenv. If it is not installed in your environment for some reason, you can install it via the package for your operating system, or else install it from pip:

    pip install virtualenv

You can create and activate a virtualenv by:

    # virtualenv is shipped in Python 3.6+ as venv instead of pyvenv.
    # See https://docs.python.org/3.6/library/venv.html
    python3 -m venv venv
    . venv/bin/activate

On Windows the syntax for activating it is a bit different:

    venv\Scripts\activate

Once you have activated your virtualenv, everything you do is confined inside the virtualenv. To exit a virtualenv just type deactivate.

Python’s setup tools and pip

Put all the chances on your side by getting the very latest pip and setuptools libraries:

    pip install --upgrade setuptools pip

Superset installation and initialization

Follow these few simple steps to install Superset:

    # Install superset
    pip install superset

    # Initialize the database
    superset db