class: center middle main-title section-title-4 # Joins and logic .class-info[ **Week 4** AEM 2850 / 5850 : R for Business Analytics<br> Cornell Dyson<br> Fall 2025 Acknowledgements: <!-- [Andrew Heiss](https://datavizm20.classes.andrewheiss.com), --> <!-- [Claus Wilke](https://wilkelab.org/SDS375/), --> [Grant McDermott](https://github.com/uo-ec607/lectures), [Jenny Bryan](https://stat545.com/join-cheatsheet.html), <!-- [Allison Horst](https://github.com/allisonhorst/stats-illustrations), --> [R4DS (2e)](https://r4ds.hadley.nz), [Garrick Aden-Buie](https://github.com/gadenbuie/tidyexplain) ] --- # Announcements Reminders: - Submit assignments via canvas / gradescope - Homework - Week 3 was due yesterday (Monday) at 11:59pm Questions before we get started? --- # Plan for this week .pull-left[ ### Tuesday [Prologue](#prologue) [Joins](#joins) <!-- - [Join animations](#join-animations) --> <!-- - [Reference slides on joins](#joins-extra) --> [example-04-1](#example-04-1) ] .pull-right[ ### Thursday [Logic](#logic) - [Boolean algebra](#boolean-algebra) - [Conditional transformations](#conditionals) [example-04-2](#example-04-2) ] --- class: inverse, center, middle name: prologue # Prologue --- # What sports do we watch? Take a guess: what's the most popular spectator sport among classmates? -- Here are the first 20 responses: ``` ## [1] "soccer" "baseball" "baseball" "badminton" "football" ## [6] "baseball" "volleyball" "swimming" "football" "tennis" ## [11] "soccer" "football" "soccer" "volleyball" "basketball" ## [16] "hockey" "tennis" "volleyball" "baseball" "soccer" ``` -- Let's `count` and `arrange` to get the top 3: ``` ## # A tibble: 3 × 2 ## sport n ## <chr> <int> ## 1 basketball 14 ## 2 baseball 12 ## 3 soccer 12 ``` --- # R can be used for sports analytics, too! .pull-left[ <img src="img/04/stephen-curry.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="img/04/kobe-bryant.png" width="100%" style="display: block; margin: auto;" /> ] ??? Source: https://github.com/toddwschneider/ballr --- class: inverse, center, middle name: joins # Joins --- # Joins Most data analyses require information contained in multiple data frames We **join** them together to answer questions **Keys** are the variables that connect a pair of data frames in a join --- # Join verbs from dplyr 1. **Mutating joins**: add new variables - `left_join()` - `right_join()` - `inner_join()` - `full_join()` 2. **Filtering joins**: filter observations - `semi_join()` - `anti_join()` ??? You can visualize the operations [here](https://r4ds.hadley.nz/joins) --- class: inverse, center, middle name: join-animations # Join animations --- # Let's start by visualizing joins Here are two data frames we want to **join** <img src="img/04/original-dfs.png" width="45%" style="display: block; margin: auto;" /> Their **keys** are in color in the first column, and other data are in grey --- # Left join and right join Left or right joins add variables to the left or right data frames .pull-left[ <img src="img/04/left-join.gif" width="100%" style="display: block; margin: auto;" /> ] -- .pull-right[ <img src="img/04/right-join.gif" width="100%" style="display: block; margin: auto;" /> ] ??? Source: <https://github.com/gadenbuie/tidyexplain> --- # Multiple matches With multiple matches between `x` and `y`, all combinations of matches are returned <img src="img/04/left-join-extra.gif" width="45%" style="display: block; margin: auto;" /> In this example, `x2` is duplicated to join one row in `x` to multiple rows in `y` ??? Source: <https://github.com/gadenbuie/tidyexplain> --- # Inner join and full join .pull-left[ Inner joins return all rows in `x` **AND** `y` <img src="img/04/inner-join.gif" width="100%" style="display: block; margin: auto;" /> ] -- .pull-right[ Full joins return all rows in `x` **OR** `y` <img src="img/04/full-join.gif" width="100%" style="display: block; margin: auto;" /> ] ??? Source: <https://github.com/gadenbuie/tidyexplain> --- # Semi join and anti join .pull-left[ Semi joins filter rows in `x` that match `y` <img src="img/04/semi-join.gif" width="100%" style="display: block; margin: auto;" /> ] -- .pull-right[ Anti joins filter rows in `x` **not** in `y` <img src="img/04/anti-join.gif" width="100%" style="display: block; margin: auto;" /> ] ??? Source: <https://github.com/gadenbuie/tidyexplain> --- class: inverse, center, middle name: example-04-1 # example-04-1 --- class: inverse, center, middle name: joins-extra # Additional slides on joins for your reference --- # Joins Let's learn these join commands using two small data frames .pull-left[ ``` r superheroes ``` ``` ## # A tibble: 7 × 3 ## name alignment publisher ## <chr> <chr> <chr> ## 1 Magneto bad Marvel ## 2 Storm good Marvel ## 3 Mystique bad Marvel ## 4 Batman good DC ## 5 Joker bad DC ## 6 Catwoman bad DC ## 7 Hellboy good Dark Horse Comics ``` ] .pull-right[ ``` r publishers ``` ``` ## # A tibble: 3 × 2 ## publisher year_founded ## <chr> <int> ## 1 DC 1934 ## 2 Marvel 1939 ## 3 Image 1992 ``` ] ??? Source: https://stat545.com/join-cheatsheet.html --- # 1) dplyr::left_join(x, y) ``` r left_join(superheroes, publishers) ``` ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 7 × 4 ## name alignment publisher year_founded ## <chr> <chr> <chr> <int> ## 1 Magneto bad Marvel 1939 ## 2 Storm good Marvel 1939 ## 3 Mystique bad Marvel 1939 ## 4 Batman good DC 1934 ## 5 Joker bad DC 1934 ## 6 Catwoman bad DC 1934 ## 7 Hellboy good Dark Horse Comics NA ``` `left_join` is a **mutating join**: it adds variables to `x` `left_join` returns all rows from `x` --- # 2) dplyr::right_join(x, y) ``` r right_join(superheroes, publishers) ``` ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 7 × 4 ## name alignment publisher year_founded ## <chr> <chr> <chr> <int> ## 1 Magneto bad Marvel 1939 ## 2 Storm good Marvel 1939 ## 3 Mystique bad Marvel 1939 ## 4 Batman good DC 1934 ## 5 Joker bad DC 1934 ## 6 Catwoman bad DC 1934 ## 7 <NA> <NA> Image 1992 ``` `right_join` is a **mutating join**: it adds variables to `y` `right_join` returns all rows from `y` --- # 3) dplyr::inner_join(x, y) ``` r inner_join(superheroes, publishers) ``` ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 6 × 4 ## name alignment publisher year_founded ## <chr> <chr> <chr> <int> ## 1 Magneto bad Marvel 1939 ## 2 Storm good Marvel 1939 ## 3 Mystique bad Marvel 1939 ## 4 Batman good DC 1934 ## 5 Joker bad DC 1934 ## 6 Catwoman bad DC 1934 ``` How is `inner_join` different from `left_join` and `right_join`? -- `inner_join` returns all rows in `x` **AND** `y` --- # 4) dplyr::full_join(x, y) ``` r full_join(superheroes, publishers) # how many rows do you think this will produce? ``` -- ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 8 × 4 ## name alignment publisher year_founded ## <chr> <chr> <chr> <int> ## 1 Magneto bad Marvel 1939 ## 2 Storm good Marvel 1939 ## 3 Mystique bad Marvel 1939 ## 4 Batman good DC 1934 ## 5 Joker bad DC 1934 ## 6 Catwoman bad DC 1934 ## 7 Hellboy good Dark Horse Comics NA ## 8 <NA> <NA> Image 1992 ``` `full_join` returns all rows in `x` **OR** `y` --- # 5) dplyr::semi_join(x, y) .pull-left[ ``` r superheroes ``` ``` ## # A tibble: 7 × 3 ## name alignment publisher ## <chr> <chr> <chr> ## 1 Magneto bad Marvel ## 2 Storm good Marvel ## 3 Mystique bad Marvel ## 4 Batman good DC ## 5 Joker bad DC ## 6 Catwoman bad DC ## 7 Hellboy good Dark Horse Comics ``` ] .pull-right[ ``` r semi_join(superheroes, publishers) ``` ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 6 × 3 ## name alignment publisher ## <chr> <chr> <chr> ## 1 Magneto bad Marvel ## 2 Storm good Marvel ## 3 Mystique bad Marvel ## 4 Batman good DC ## 5 Joker bad DC ## 6 Catwoman bad DC ``` ] `semi_join` is a **filtering join**: it keeps observations in `x` that have a match in `y` Note that the variables do not change --- # 6) dplyr::anti_join(x, y) .pull-left[ ``` r superheroes ``` ``` ## # A tibble: 7 × 3 ## name alignment publisher ## <chr> <chr> <chr> ## 1 Magneto bad Marvel ## 2 Storm good Marvel ## 3 Mystique bad Marvel ## 4 Batman good DC ## 5 Joker bad DC ## 6 Catwoman bad DC ## 7 Hellboy good Dark Horse Comics ``` ] .pull-right[ ``` r anti_join(superheroes, publishers) ``` ``` ## Joining with `by = join_by(publisher)` ``` ``` ## # A tibble: 1 × 3 ## name alignment publisher ## <chr> <chr> <chr> ## 1 Hellboy good Dark Horse Comics ``` ] `anti_join` is a **filtering join**: it keeps obs. in `x` that **DO NOT** have a match in `y` Note that the variables do not change --- # Key variables How do `dplyr` join commands know what variables to use as **keys**? -- By default, `*_join()` uses all variables that are common across `x` and `y` ``` r intersect(names(superheroes), names(publishers)) # variable used for matching before ``` ``` ## [1] "publisher" ``` -- Or, we can specify what to join by: `*_join(..., by = join_by(publisher))` .small[Note: before `dplyr` 1.1.0, the syntax was: `*_join(..., by = "publisher")`] --- # Exploring keys ``` r library(nycflights13) # let's explore keys using the nycflights13 data flights |> print(n = 8) # print the first 8 rows of flights ``` ``` ## # A tibble: 336,776 × 19 ## year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time ## <int> <int> <int> <int> <int> <dbl> <int> <int> ## 1 2013 1 1 517 515 2 830 819 ## 2 2013 1 1 533 529 4 850 830 ## 3 2013 1 1 542 540 2 923 850 ## 4 2013 1 1 544 545 -1 1004 1022 ## 5 2013 1 1 554 600 -6 812 837 ## 6 2013 1 1 554 558 -4 740 728 ## 7 2013 1 1 555 600 -5 913 854 ## 8 2013 1 1 557 600 -3 709 723 ## # ℹ 336,768 more rows ## # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>, ## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, ## # hour <dbl>, minute <dbl>, time_hour <dttm> ``` --- # Exploring keys ``` r planes # print the first 10 rows of planes ``` ``` ## # A tibble: 3,322 × 9 ## tailnum year type manufacturer model engines seats speed engine ## <chr> <int> <chr> <chr> <chr> <int> <int> <int> <chr> ## 1 N10156 2004 Fixed wing multi… EMBRAER EMB-… 2 55 NA Turbo… ## 2 N102UW 1998 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 3 N103US 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 4 N104UW 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 5 N10575 2002 Fixed wing multi… EMBRAER EMB-… 2 55 NA Turbo… ## 6 N105UW 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 7 N107US 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 8 N108UW 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 9 N109UW 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## 10 N110UW 1999 Fixed wing multi… AIRBUS INDU… A320… 2 182 NA Turbo… ## # ℹ 3,312 more rows ``` --- # Let's perform a left join on flights and planes ``` r *left_join(flights, planes) |> select(year:dep_time, arr_time, carrier:tailnum, type, model) |> # keep text to one slide print(n = 5) # just to save vertical space on the slide ``` ``` ## Joining with `by = join_by(year, tailnum)` ``` ``` ## # A tibble: 336,776 × 10 ## year month day dep_time arr_time carrier flight tailnum type model ## <int> <int> <int> <int> <int> <chr> <int> <chr> <chr> <chr> ## 1 2013 1 1 517 830 UA 1545 N14228 <NA> <NA> ## 2 2013 1 1 533 850 UA 1714 N24211 <NA> <NA> ## 3 2013 1 1 542 923 AA 1141 N619AA <NA> <NA> ## 4 2013 1 1 544 1004 B6 725 N804JB <NA> <NA> ## 5 2013 1 1 554 812 DL 461 N668DN <NA> <NA> ## # ℹ 336,771 more rows ``` -- Uh-oh! What's up with `type` and `model`? --- # Uh-oh! As before, `dplyr` guessed which columns to join on It uses columns with the same name: ``` *## Joining, by = c("year", "tailnum") ``` Does anyone see a potential problem here? -- The variable `year` does not have a consistent meaning across the datasets -- In `flights` it refers to the *year of flight*, in `planes` it refers to *year of construction* -- Luckily we can avoid this by using the argument `by = join_by(...)` --- # What should we join flights and planes by? <img src="img/04/nycflights13-relational.png" width="75%" style="display: block; margin: auto;" /> --- # Specifying join keys We just need to be explicit in the join call by using the `by` argument ``` r left_join(flights, planes |> rename(year_built = year), # not necessary w/ below line, but helpful * by = join_by(tailnum) # be specific about the joining column ) |> select(year, month:dep_time, carrier, flight, tailnum, year_built, type, model) |> print(n = 5) # just to save vertical space on the slide ``` ``` ## # A tibble: 336,776 × 10 ## year month day dep_time carrier flight tailnum year_built type model ## <int> <int> <int> <int> <chr> <int> <chr> <int> <chr> <chr> ## 1 2013 1 1 517 UA 1545 N14228 1999 Fixed wing… 737-… ## 2 2013 1 1 533 UA 1714 N24211 1998 Fixed wing… 737-… ## 3 2013 1 1 542 AA 1141 N619AA 1990 Fixed wing… 757-… ## 4 2013 1 1 544 B6 725 N804JB 2012 Fixed wing… A320… ## 5 2013 1 1 554 DL 461 N668DN 1991 Fixed wing… 757-… ## # ℹ 336,771 more rows ``` --- # Specifying join keys What happens if we don't rename `year` before this join? ``` r left_join(flights, * planes, # not renaming "year" to "year_built" this time by = join_by(tailnum) ) |> select(contains("year"), month:dep_time, arr_time, carrier, flight, tailnum, type, model) |> print(n = 4) # just to save vertical space on the slide ``` ``` ## # A tibble: 336,776 × 11 ## year.x year.y month day dep_time arr_time carrier flight tailnum type model ## <int> <int> <int> <int> <int> <int> <chr> <int> <chr> <chr> <chr> ## 1 2013 1999 1 1 517 830 UA 1545 N14228 Fixe… 737-… ## 2 2013 1998 1 1 533 850 UA 1714 N24211 Fixe… 737-… ## 3 2013 1990 1 1 542 923 AA 1141 N619AA Fixe… 757-… ## 4 2013 2012 1 1 544 1004 B6 725 N804JB Fixe… A320… ## # ℹ 336,772 more rows ``` -- What is `year.x`? What is `year.y`? --- class: inverse, center, middle name: summary # Summary of key verbs so far --- # Key verbs .pull-left-4[ ### Import #### readr 1. `read_csv` 2. `write_csv` #### readxl 1. `read_excel` ] .pull-midleft-4[ ### Tidy #### tidyr 1. `pivot_longer` 2. `pivot_wider` 3. `separate_wider_delim` ] .pull-midright-4[ ### Join #### dplyr 1. `left_join` 2. `right_join` 3. `inner_join` 4. `full_join` 5. `semi_join` 6. `anti_join` ] .pull-right-4[ ### Transform #### dplyr 1. `filter` 2. `arrange` 3. `select` 4. `mutate` 5. `summarize` ] --- class: inverse, center, middle name: logic # Logic --- # Logical vectors What values can the logical data type take? -- Logical values can be `TRUE`, `FALSE`, or `NA` -- What are **logical vectors**? -- Logical vectors are just vectors ("columns") that only contain `TRUE`, `FALSE`, or `NA` --- # Logical vectors Can you think of any logical vectors we have worked with so far? -- While we don't often see logical vectors in raw data, we use them all the time! Example: every time we make comparisons to `filter()` data we create _transient_ logical variables that are computed, used, and then thrown away --- # Transient logical vectors We create a transient logical vector when we filter `flights` to Miami: ``` r library(nycflights13) flights |> select(carrier, flight, dest) |> filter(dest == "MIA") ``` ``` ## # A tibble: 11,728 × 3 ## carrier flight dest ## <chr> <int> <chr> ## 1 AA 1141 MIA ## 2 AA 1895 MIA ## 3 UA 1077 MIA ## 4 AA 1837 MIA ## 5 DL 2003 MIA ## 6 AA 2279 MIA ## 7 AA 2267 MIA ## 8 DL 1843 MIA ## 9 AA 443 MIA ## 10 DL 2143 MIA ## # ℹ 11,718 more rows ``` --- <!-- # What goes on under the hood of a filter --> # `filter(dest == "MIA")`: under the hood .pull-left[ ``` r # create a logical vector from a comparison flights |> select(carrier, flight, dest) |> * mutate(welcome_to_miami = dest == "MIA") ``` ``` ## # A tibble: 336,776 × 4 ## carrier flight dest welcome_to_miami ## <chr> <int> <chr> <lgl> ## 1 UA 1545 IAH FALSE ## 2 UA 1714 IAH FALSE *## 3 AA 1141 MIA TRUE ## 4 B6 725 BQN FALSE ## 5 DL 461 ATL FALSE ## 6 UA 1696 ORD FALSE ## 7 B6 507 FLL FALSE ## 8 EV 5708 IAD FALSE ## 9 B6 79 MCO FALSE ## 10 AA 301 ORD FALSE ## # ℹ 336,766 more rows ``` ] -- .pull-right[ ``` r flights |> select(carrier, flight, dest) |> mutate(welcome_to_miami = dest == "MIA") |> * filter(welcome_to_miami) # then filter ``` ``` ## # A tibble: 11,728 × 4 ## carrier flight dest welcome_to_miami ## <chr> <int> <chr> <lgl> *## 1 AA 1141 MIA TRUE ## 2 AA 1895 MIA TRUE ## 3 UA 1077 MIA TRUE ## 4 AA 1837 MIA TRUE ## 5 DL 2003 MIA TRUE ## 6 AA 2279 MIA TRUE ## 7 AA 2267 MIA TRUE ## 8 DL 1843 MIA TRUE ## 9 AA 443 MIA TRUE ## 10 DL 2143 MIA TRUE ## # ℹ 11,718 more rows ``` ] --- # Comparisons Numeric comparisons like `<`, `<=`, `>`, `>=`, `!=`, and `==` can be used to create logical vectors As we have seen, `==` and `!=` are useful for comparing characters (i.e., strings) --- # `is.na()` `is.na()` is a useful function for checking whether something is `NA` Why use `is.na(x)` when we could just use `x == NA`? -- ``` r x <- 2850 + 5850 is.na(x) ``` ``` ## [1] FALSE ``` -- ``` r x == NA ``` ``` ## [1] NA ``` May seem odd but it makes sense when you think about concrete comparisons --- name: boolean-algebra # Boolean algebra Use Boolean algebra to combine comparisons / logical vectors <img src="img/04/boolean-algebra.png" width="80%" style="display: block; margin: auto;" /> ??? Source: <https://r4ds.hadley.nz/logicals.html#boolean-algebra> --- # Boolean operator examples .pull-left[ ``` r flights |> select(carrier, flight, dest) |> * filter(dest == "MIA" | dest == "MYR") ``` ``` ## # A tibble: 11,787 × 3 ## carrier flight dest ## <chr> <int> <chr> ## 1 AA 1141 MIA ## 2 AA 1895 MIA ## 3 UA 1077 MIA ## 4 AA 1837 MIA ## 5 DL 2003 MIA ## 6 AA 2279 MIA ## 7 AA 2267 MIA ## 8 DL 1843 MIA ## 9 AA 443 MIA ## 10 EV 4412 MYR ## # ℹ 11,777 more rows ``` ] .pull-right[ ``` r flights |> select(carrier, flight, dest) |> * filter(carrier == "AA" & dest == "MIA") ``` ``` ## # A tibble: 7,234 × 3 ## carrier flight dest ## <chr> <int> <chr> ## 1 AA 1141 MIA ## 2 AA 1895 MIA ## 3 AA 1837 MIA ## 4 AA 2279 MIA ## 5 AA 2267 MIA ## 6 AA 443 MIA ## 7 AA 647 MIA ## 8 AA 2099 MIA ## 9 AA 1623 MIA ## 10 AA 2253 MIA ## # ℹ 7,224 more rows ``` ] --- # %in% `x %in% y` is a useful shortcut for identifying whether a value in `x` is contained in `y` ``` r flights |> select(carrier, flight, dest) |> * filter(dest %in% c("MIA", "MYR")) ``` ``` ## # A tibble: 11,787 × 3 ## carrier flight dest ## <chr> <int> <chr> ## 1 AA 1141 MIA ## 2 AA 1895 MIA ## 3 UA 1077 MIA ## 4 AA 1837 MIA ## 5 DL 2003 MIA ## 6 AA 2279 MIA ## 7 AA 2267 MIA ## 8 DL 1843 MIA ## 9 AA 443 MIA ## 10 EV 4412 MYR ## # ℹ 11,777 more rows ``` --- # Numeric operations on logical vectors Numeric operations treat `TRUE` as `1` and `FALSE` as `0`: ``` r x <- c(TRUE, TRUE, FALSE, FALSE, FALSE) ``` .pull-left[ ``` r sum(x) ``` ``` ## [1] 2 ``` ``` r mean(x) ``` ``` ## [1] 0.4 ``` ] .pull-right[ ``` r min(x) ``` ``` ## [1] 0 ``` ``` r max(x) ``` ``` ## [1] 1 ``` ] This can be handy when doing calculations that depend on conditions --- name: conditionals # Conditional transformations `if_else()` can be used to do things based on a binary condition ``` r flights |> filter(dest == "MIA") |> select(carrier, flight, dest, sched_dep_time) |> * mutate(too_early = if_else(sched_dep_time < 800, "too early!", "okay")) ``` ``` ## # A tibble: 11,728 × 5 ## carrier flight dest sched_dep_time too_early ## <chr> <int> <chr> <int> <chr> ## 1 AA 1141 MIA 540 too early! ## 2 AA 1895 MIA 610 too early! ## 3 UA 1077 MIA 607 too early! ## 4 AA 1837 MIA 610 too early! ## 5 DL 2003 MIA 700 too early! ## 6 AA 2279 MIA 700 too early! ## 7 AA 2267 MIA 755 too early! ## 8 DL 1843 MIA 800 okay ## 9 AA 443 MIA 715 too early! ## 10 DL 2143 MIA 900 okay ## # ℹ 11,718 more rows ``` --- # Conditional transformations `case_when()` is a more flexible approach that allows many different conditions .pull-left[ ``` r flights |> filter(dest == "MIA") |> select(carrier, flight, sched_dep_time) |> * mutate(too_early = case_when( * sched_dep_time < 600 ~ "too early!", * sched_dep_time < 800 ~ "still early", * sched_dep_time <= 2000 ~ "okay", * sched_dep_time > 2000 ~ "late" * ) * ) ``` Conditions are evaluated in order ] .pull-right[ ``` ## # A tibble: 11,728 × 4 ## carrier flight sched_dep_time too_early ## <chr> <int> <int> <chr> ## 1 AA 1141 540 too early! ## 2 AA 1895 610 still early ## 3 UA 1077 607 still early ## 4 AA 1837 610 still early ## 5 DL 2003 700 still early ## 6 AA 2279 700 still early ## 7 AA 2267 755 still early ## 8 DL 1843 800 okay ## 9 AA 443 715 still early ## 10 DL 2143 900 okay ## # ℹ 11,718 more rows ``` ] `condition ~ output` syntax is new; watch out for overlapping conditions! --- class: inverse, center, middle name: example-04-2 # example-04-2