class: center, middle, inverse, title-slide #
Creating
Interactive Graphics for the Web using R ### Carson Sievert ### Slides:
https://talks.cpsievert.me
Twitter:
@cpsievert
GitHub:
@cpsievert
Email:
cpsievert1@gmail.com
Web:
https://cpsievert.me/
Slides released under
Creative Commons
--- background-image: url(workflow.svg) background-size: contain class: inverse # Data Science Workflow ??? I love this diagram from the R for Data Science book. Concisely captures the main components. --- background-image: url(workflow1.svg) background-size: contain class: inverse # <font color="#cc1f29"> Expository vis </font> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> ### The web is the preferred medium for communicating results. ### Assuming you know _exactly_ what you want to visualize, many good JavaScript frameworks exist! --- background-image: url(workflow2.svg) background-size: contain class: inverse # <font color="#cc1f29"> Exploratory vis </font> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> ### JavaScript lacks tools for modeling/transformation ### Too often, analysts juggle several technolgies (R, Python, JavaScript) ??? JavaScript lacks tools for iteration (necessary for exploration/discovery!) --- background-image: url(../gifs/swamp.gif) background-size: contain class: inverse, center, bottom ## It is all too easy for statistical thinking to be **swamped by programming tasks.** -- Brian D. Ripley ??? So, this is me, in my 2nd year of grad school, deciding to learn D3 & JavaScript. It took me 6+ months to implement a single interactive visualization. And let me tell you, you guys, no joke, believe me, I arose from the swamp, and decide I alone will... --- background-image: url(../gifs/drain-the-swamp.gif) background-size: contain class: inverse, center # ☝ 🍊 --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. Don't require knowledge of web technologies (**start-up cost**) 2. Produce standalone HTML whenever possible (**hosting/maintenance cost**) 3. Work well with other "tidy" tools in R (**iteration cost**) 4. Link to external vis libraries (**startover cost**) 5. Easy to use int. techniques that <i>support data analysis tasks</i> (**discovery cost**) --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. <font color="#cc1f29">Don't require knowledge of web technologies</font> (**start-up cost**) 2. Produce standalone HTML whenever possible (**hosting/maintenance cost**) 3. Work well with other "tidy" tools in R (**iteration cost**) 4. Link to external vis libraries (**startover cost**) 5. Easy to use int. techniques that <i>support data analysis tasks</i> (**discovery cost**) .footnote[ <font color="#cc1f29"> We can all get behind this, right?</font> ] --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. Don't require knowledge of web technologies (**start-up cost**) 2. <font color="#cc1f29">Produce standalone HTML whenever possible</font> (**hosting/maintenance cost**) 3. Work well with other "tidy" tools in R (**iteration cost**) 4. Link to external vis libraries (**startover cost**) 5. Easy to use int. techniques that <i>support data analysis tasks</i> (**discovery cost**) .footnote[ <font color="#cc1f29"> As opposed to a client-server framework</font> ] --- background-image: url(server-client.svg) background-size: contain --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. Don't require knowledge of web technologies (**start-up cost**) 2. Produce standalone HTML whenever possible (**hosting/maintenance cost**) 3. <font color="#cc1f29">Work well with other "tidy" tools in R</font> (**iteration cost**) 4. Link to external vis libraries (**startover cost**) 5. Easy to use int. techniques that <i>support data analysis tasks</i> (**discovery cost**) .footnote[ <font color="#cc1f29"> Let me give you an example with plotly</font> ] --- background-image: url(europe.png) background-size: contain class: inverse --- ```r library(tidyverse) library(ggplot2) *# read and clean data d <- read_csv('GEOSTAT_grid_POP_1K_2011_V2_0_1.csv') %>% rbind(read_csv('JRC-GHSL_AIT-grid-POP_1K_2011.csv') %>% mutate(TOT_P_CON_DT = '')) %>% mutate( lat = as.numeric(gsub('.*N([0-9]+)[EW].*', '\\1', GRD_ID))/100, lng = as.numeric(gsub('.*[EW]([0-9]+)', '\\1', GRD_ID)) * ifelse(gsub('.*([EW]).*', '\\1', GRD_ID) == 'W', -1, 1) / 100 ) %>% filter(lng > 25, lng < 60) %>% group_by(lat = round(lat, 1), lng = round(lng, 1)) %>% summarize(value = sum(TOT_P, na.rm = T)) %>% ungroup() %>% tidyr::complete(lat, lng) *# visualize ggplot(d, aes(lng, lat + 5*(value / max(value, na.rm = T)))) + geom_line( aes(group = lat, text = paste("Population:", value)), size = 0.4, alpha = 0.8, color = '#5A3E37', na.rm = T ) + coord_equal(0.9) + ggthemes::theme_map() ``` --- ```r library(tidyverse) *library(plotly) d <- read_csv('GEOSTAT_grid_POP_1K_2011_V2_0_1.csv') %>% rbind(read_csv('JRC-GHSL_AIT-grid-POP_1K_2011.csv') %>% mutate(TOT_P_CON_DT = '')) %>% mutate( lat = as.numeric(gsub('.*N([0-9]+)[EW].*', '\\1', GRD_ID))/100, lng = as.numeric(gsub('.*[EW]([0-9]+)', '\\1', GRD_ID)) * ifelse(gsub('.*([EW]).*', '\\1', GRD_ID) == 'W', -1, 1) / 100 ) %>% filter(lng > 25, lng < 60) %>% group_by(lat = round(lat, 1), lng = round(lng, 1)) %>% summarize(value = sum(TOT_P, na.rm = T)) %>% ungroup() %>% tidyr::complete(lat, lng) *# make each latitude "highlight-able" *sd <- crosstalk::SharedData$new(d, ~lat) ggplot(sd, aes(lng, lat + 5*(value / max(value, na.rm = T)))) + geom_line( aes(group = lat, text = paste("Population:", value)), size = 0.4, alpha = 0.8, color = '#5A3E37', na.rm = T ) + coord_equal(0.9) + ggthemes::theme_map() *ggplotly() ``` --- <iframe src="europe.html" width="100%" height="750" scrolling="no" seamless="seamless" frameBorder="0"> </iframe> --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. Don't require knowledge of web technologies (**start-up cost**) 2. Produce standalone HTML whenever possible (**hosting/maintenance cost**) 3. Work well with other "tidy" tools in R (**iteration cost**) 4. <font color="#cc1f29"> Link to external vis libraries </font> (**startover cost**) 5. Easy to use int. techniques that <i>support data analysis tasks</i> (**discovery cost**) .footnote[ <font color="#cc1f29"> Let's see an example with plotly and leaflet</font> ] --- background-image: url(plotlyLeaflet.gif) background-size: contain ## plotly & leaflet --- ```r library(plotly) library(leaflet) library(crosstalk) *# use uniform/standard data structures! *sd <- SharedData$new(quakes) p <- plot_ly(sd, x = ~depth, y = ~mag) %>% add_markers(alpha = 0.5) %>% highlight("plotly_selected", dynamic = TRUE) map <- leaflet(sd) %>% addTiles() %>% addCircles() bscols(widths = c(6, 6), p, map) ``` --- class: middle # How to drain the swamp? R package(s) for creating interactive web graphics which: 1. Don't require knowledge of web technologies (**start-up cost**) 2. Produce standalone HTML whenever possible (**hosting/maintenance cost**) 3. Work well with other "tidy" tools in R (**iteration cost**) 4. Link to external vis libraries (**startover cost**) 5. <font color="#cc1f29">Easy to use int. techniques that <i>support data analysis tasks</i></font> (**discovery cost**) .footnote[ <font color="#cc1f29">Hard problem -- statisticians should be more involved here!</font> ] --- class: inverse, middle, center # Interactive graphics software must be opiniated ## Not enough statisticians influence design/implementation ## We used to be better at this!!! <http://stat-graphics.org/movies/> --- class: inverse, middle, center # Interactive graphics software must be opiniated ## Not enough statisticians influence design/implementation ## We used to be better at this!!! <http://stat-graphics.org/movies/> .footnote[ ## Should we be teaching more JavaScript and less C? 🙀 ] --- background-image: url(taxonomy.svg) background-size: contain class: bottom, inverse ## Techniques that support data analysis [Cook et al 1996](https://www.jstor.org/stable/1390754) --- ## Finding Gestalt & posing queries <iframe src="https://player.vimeo.com/video/192528320" width="100%" height="550" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe> --- ## Making comparisons <iframe src="compare.html" width="100%" height="500" scrolling="no" seamless="seamless" frameBorder="0"> </iframe> --- background-image: url(tour.gif) background-size: contain class: bottom ## Finding Gestalt & "making comparisons" --- background-image: url(taxonomy2.svg) background-size: contain class: inverse ## Important design questions <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> ### <font color="#cc1f29">What kind of visualizations should be possible/easy?</font> --- background-image: url(taxonomy3.svg) background-size: contain class: inverse ## Important design questions <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> ### <font color="#cc1f29">How do we help users find the right view?</font> --- background-image: url(taxonomy4.svg) background-size: contain class: inverse ## Important design questions <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> <br /> ### <font color="#cc1f29">How to best enable dynamic answers to (statistical) questions via interactivity?</font> --- ## In summary * There is a lack of tools for _exploratory_ data visualization on the web. * JavaScript is not designed to do statistical computing. * Lets create R interfaces that leverage the computing resources of R with the interactivity of JavaScript! --- class: middle, center ## Thanks! Questions? Slides: <https://talks.cpsievert.me> <br /> #### Learn more Plotly book: <https://plotly-book.cpsievert.me> <br /> PhD thesis: <https://github.com/cpsievert/phd-thesis> <br /> #### Contact Twitter: <a href='https://twitter.com/cpsievert'>@cpsievert</a> <br /> GitHub: <a href='https://github.com/cpsievert'>@cpsievert</a> <br /> Email: <cpsievert1@gmail.com> <br /> Web: <https://cpsievert.me/>