Analyzing Big Data with SQL

Description

In this course, you’ll get an in-depth look at the SQL SELECT statement and its main clauses. The course focuses on the big data SQL engines Apache Hive and Apache Impala, but most of the information also applies to SQL on traditional RDBMSs; the instructor explicitly addresses differences for MySQL and PostgreSQL.

By the end of the course, you will be able to
• explore and navigate databases and tables using different tools;
• understand the basics of SELECT statements;
• understand how and why to filter results;
• explore grouping and aggregation to answer analytic questions;
• work with sorting and limiting results; and
• combine multiple tables in different ways.
To use the hands-on environment for this course, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:
• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
• 64-bit operating system (32-bit operating systems will not work)
• 8 GB RAM or more
• 25 GB free disk space or more
• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)

What you will learn

Orientation to SQL on Big Data
SQL SELECT Essentials
Filtering Data
Grouping and Aggregating Data

What’s included