{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# HW3. More SQL "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Objectives"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this assignment, you will write more complex SQL queries to query a database. \n",
    " - How to use `Order By` to sort data\n",
    " - How to use `Set Operators` to union/intersect multiple tables\n",
    " - How to use `Join Opeartor` to join multiple tables\n",
    " - How to use `Aggregations` and `Group By` to aggregate data\n",
    " - How to write `subqueries` in SQL "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Background"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suppose you work at a bank as a data analyst. Your main job is to analyze the data stored in their database. Please download the database at this [link](bank.db). \n",
    "\n",
    "<sup></sup>\n",
    "\n",
    "The database has five tables. The following shows their schemas. Primary key attributes are underlined and foreign keys are noted in superscript.\n",
    " - Customer = {<span style=\"text-decoration:underline\">customerID</span>, firstName, lastName, income, birthDate}\n",
    " - Account = {<span style=\"text-decoration:underline\">accNumber</span>, type, balance, branchNumber<sup>FK-Branch</sup>}\n",
    " - Owns = {<span style=\"text-decoration:underline\">customerID</span><sup>FK-Customer</sup>, <span style=\"text-decoration:underline\">accNumber</span><sup>FK-Account</sup>}\n",
    " - Transactions = {<span style=\"text-decoration:underline\">transNumber</span>, <span style=\"text-decoration:underline\">accNumber</span><sup>FK-Account</sup>, amount}\n",
    " - Employee = {<span style=\"text-decoration:underline\">sin</span>, firstName, lastName, salary, branchNumber<sup>FK-Branch</sup>}\n",
    " - Branch = {<span style=\"text-decoration:underline\">branchNumber</span>, branchName, managerSIN<sup>FK-Employee</sup>, budget}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes**\n",
    " - The *customerID* attribute (*Customer*) is a unique number that represents a customer, it is *not* a customer's SIN\n",
    " - The *accNumber* attribute (*Account*) represents the account number\n",
    " - The *balance* (*Account*) attribute represents the total amount in an account\n",
    " - The *type* (*Account*) attribute represents the type an account: chequing, saving, or business\n",
    " - The *Owns* relation represents a many-to-many relationship (between *Customer* and *Account*)\n",
    " - The *transNumber* attribute (*Transactions*) represents a transaction number, combined with account number it uniquely identify a transaction\n",
    " - The *branchNumber* attribute (*Customer*) uniquely identifies a branch\n",
    " - The *managerSIN* attribute (*Customer*) represents the SIN of the branch manager"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Questions （1 point per question)\n",
    "\n",
    "Write SQL queries to return the data specified in questions 1 to 20.\n",
    "\n",
    "**Query Requirement**\n",
    " - The answer to each question should be a single SQL query\n",
    " - You must order each query as described in the question, order is always ascending unless specified otherwise\n",
    " - Every column in the result should be named, so if the query asks you to return something like income minus salary make sure that you include an AS statement to name the column\n",
    " - While your queries will not be assessed on their efficiency, marks may be deducted if unnecessary tables are included in the query (for example including Owns and Customer when you only require the customerID of customers who own accounts)\n",
    "\n",
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Execute the next two cells**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext sql"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%sql sqlite:///bank.db"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Queries**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1.** *First name, last name, income* of customers whose income is within [60,000, 70,000], order by *income* (desc), *lastName*, *firstName*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2.** *SIN, branch name, salary and manager’s salary - salary* (that is, the salary of the employee’s manager minus salary of the employee) of all employees in New York, London or Berlin, order by ascending (manager salary - salary)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3.** *First name, last name, and income* of customers whose income is at least twice the income of any customer whose lastName is Butler, order by last name then first name. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**4.** *Customer ID, income, account numbers and branch numbers* of customers with income greater than 90,000 who own an account at both London and Latveria branches, order by customer ID then account number. The result should contain all the account numbers of customers who meet the criteria, even if the account itself is not held at London or Latveria."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**5.** *Customer ID, types, account numbers and balances* of business (type *BUS*) and savings (type *SAV*) accounts owned by customers who own at least one business account or at least one savings account, order by customer ID, then type, then account number."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**6.** *Branch name, account number and balance* of accounts with balances greater than $110,000 held at the branch managed by Phillip Edwards, order by account number."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**7.** Customer ID of customers who have an account at the *New York* branch, who do not own an account at the London branch and who do not co-own an account with another customer who owns an account at the *London* branch, order by customer ID. The result should not contain duplicate customer IDs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**8.** *SIN, first name, last name, and salary* of employees who earn more than $70,000, if they are managers show the branch name of their branch in a fifth column (which should be NULL/NONE for most employees), order by branch name. You must use an outer join in your solution (which is the easiest way to do it)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**9.** Exactly as question eight, except that your query cannot include any join operation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**10.** *Customer ID, first name, last name and income* of customers who have income greater than 5000 and own accounts in all of the branches that *Helen Morgan* owns accounts in, order by income in descreasing order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**11.**  *SIN, first name, last name and salary* of the lowest paid employee (or employees) of the *London* branch, order by sin."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**12.**  *Branch name, and the difference of maximum and minimum (salary gap) and average salary* of the employees at each branch, order by branch name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**13.**  *Count* of the number of employees working at the *New York* branch and *Count* of the number of different last names of employees working at the *New York* branch (two numbers in a single row)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**14.** *Sum* of the employee salaries (a single number) at the *New York* branch."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**15.**  *Customer ID, first name and last name* of customers who own accounts at a max of four different branches, order by Last Name and first Name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**16.**  *Average income* of customers older than 60 and average income of customers younger than 20, the result must have two named columns, with one row, in one result set (hint: look up [SQLite time and date functions](https://www.sqlite.org/lang_datefunc.html))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**17.**  *Customer ID, last name, first name, income, and average account balance* of customers who have at least three accounts, and whose last names begin with *S* and contain an *e* (e.g. **S**t**e**ve) **or** whose first names begin with *A* and have the letter *n* just before the last 2 letters (e.g. **An**ne), order by customer ID. Note that to appear in the result customers must have at least 2 accounts and satisfy one (or both) of the name conditions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**18.**  *Account number, balance, sum of transaction amounts, and balance - transaction sum* for accounts in the *London* branch that have at least 15 transactions, order by transaction sum."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**19.**  *Branch name, account type, and average transaction amount* of each account type for each branch for branches that have at least 50 accounts of any type, order by branch name, then account type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**20.**  *Branch name, account type, account number, transaction number and amount* of transactions of accounts where the average transaction amount is greater than three times the (overall) average transaction amount of accounts of that type. For example, if the average transaction amount of all business accounts is 2,000 then return transactions from business accounts where the average transaction amount for that account is greater than 6,000. Order by branch name, then account type, account number and finally transaction number. Note that all transactions of qualifying accounts should be returned even if they are less than the average amount of the account type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#REPLACE WITH YOUR CODE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Submission"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Complete the code in this notebook [hw3.ipynb](hw3.ipynb). Put `hw3.ipynb` and `bank.db` into hw3.zip and submit it to the Canvas activity Assignment 3."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
