Whimsical Otter: generating project code names with R

I love assigning project code names. However, past a certain point, the growing number of projects decreases the time I can commit to giving each one a meaningful code name. I needed something more scalable that I could use quickly and revisit later if the idea or prototype had any legs to stand on. Why spend all that time on something if it would never see the light of day? ...
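As a rough illustration of the idea (the post itself uses R, and its actual approach is not shown in this excerpt), a code name generator can be as small as pairing a random adjective with a random animal. The word lists and the `code_name` function below are hypothetical placeholders, not the author's.

```python
import random

# Hypothetical word lists; swap in any vocabulary you like.
ADJECTIVES = ["whimsical", "brave", "quiet", "nimble", "curious"]
ANIMALS = ["otter", "falcon", "badger", "lynx", "heron"]


def code_name(rng=None):
    """Return a title-cased 'Adjective Animal' code name.

    Pass a seeded random.Random instance to get reproducible names.
    """
    rng = rng or random.Random()
    return f"{rng.choice(ADJECTIVES)} {rng.choice(ANIMALS)}".title()


if __name__ == "__main__":
    # Print a handful of candidate names to choose from.
    for _ in range(3):
        print(code_name())
```

Passing a seeded `random.Random` makes the output repeatable, which helps if a name needs to be regenerated later.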

May 24, 2021 · Christopher Yee

[Updated] US firearm sales in 2020

My original exploratory analysis on the topic can be found at Firearm Sales: How are Americans coping with 2020? This post is a quick #rstats follow-up to visualize the final tally for 2020 data.

Load libraries

    library(tidyverse)
    library(lubridate)
    library(scales)

Download & parse data

    df_raw <- read_csv("https://raw.githubusercontent.com/BuzzFeedNews/nics-firearm-background-checks/master/data/nics-firearm-background-checks.csv")

    df <- df_raw

    df_clean <- df %>%
      filter(month >= "2016-01" & month < "2021-01") %>%
      select(month, state, handgun, long_gun) %>%
      arrange(month) %>%
      mutate(month = as.Date(paste0(month, "-01"))) %>%
      group_by(month) %>%
      summarize(handgun = sum(handgun),
                long_gun = sum(long_gun)) %>%
      mutate(index_month = as.factor(month(month, label = TRUE)),
             index_year = as.factor(year(month))) %>%
      ungroup()

Visualize data

    df_clean %>%
      group_by(index_year) %>%
      mutate(handgun = cumsum(handgun),
             long_gun = cumsum(long_gun)) %>%
      ungroup() %>%
      select(month, index_month, index_year, handgun, long_gun) %>%
      pivot_longer(handgun:long_gun, names_to = "type") %>%
      ggplot(aes(index_month, value, color = index_year, group = index_year)) +
      geom_line() +
      geom_point() +
      scale_y_continuous(labels = comma_format()) +
      scale_color_brewer(palette = 'Paired') +
      expand_limits(y = 0) +
      facet_grid(type ~ .) +
      labs(color = NULL, x = NULL, y = NULL,
           title = "NICS Firearm Background Checks: monthly cumulative per year by type",
           caption = "by: @eeysirhc\nsource: Federal Bureau of Investigation") +
      theme_bw() +
      theme(legend.position = 'top')

...

March 5, 2021 · Christopher Yee

Visualizing FB spend: image vs video creative

Objective: plot the comparison of total Facebook spend between image and video creatives for a small sample of DTC brands. The original piece without any visualization (e.g. tabulated data) can be found here but the main takeaway:

"Though it can be tempting to go all in on video assets, I intend to use this data as added inspiration to continue investing in and testing Images."

Load modules

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set(style='darkgrid')

Encode data

    labels = ['brand', 'total_spend', 'pct_image_spend', 'image_cpa',
              'pct_video_spend', 'video_cpa']

    df = [['Brand 1', 1880000, 17, 773, 83, 805],
          ['Brand 2', 1630000, 57, 350, 44, 463],
          ['Brand 3', 1610000, 34, 179, 66, 188],
          ['Brand 4', 1300000, 12, 132, 88, 169],
          ['Brand 5', 1230000, 63, 46, 37, 40],
          ['Brand 6', 800000, 15, 22, 85, 24],
          ['Brand 7', 690000, 7, 120, 93, 127],
          ['Brand 8', 590000, 87, 18, 13, 28],
          ['Brand 9', 400000, 3, 47, 97, 0.63],
          ['Brand 10', 230000, 24, 48, 75, 114],
          ['Brand 11', 220000, 20, 25, 80, 21],
          ['Brand 12', 180000, 40, 57, 59, 51],
          ['Brand 13', 170000, 3, 47, 95, 59],
          ['Brand 14', 120000, 13, 17, 90, 13]]

    df = pd.DataFrame(df)
    df.columns = labels

Define function

We will use this simple method to categorize the brands and their different ad spend levels on Facebook. ...
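To make the objective concrete, here is a minimal sketch of turning the percentage columns into dollar amounts per creative type. It is dependency-free Python rather than pandas, and `split_spend` is a hypothetical helper, not the function the post goes on to define. Each percentage is applied to the total independently, since the published percentages do not always sum to exactly 100.

```python
def split_spend(total_spend, pct_image_spend, pct_video_spend):
    """Convert percentage shares of total spend into dollar amounts.

    Returns a (image_spend, video_spend) tuple. The two shares are applied
    independently and are not rescaled to sum to 100.
    """
    image_spend = total_spend * pct_image_spend / 100
    video_spend = total_spend * pct_video_spend / 100
    return image_spend, video_spend


# Two rows taken from the encoded data above:
# (brand, total_spend, pct_image_spend, pct_video_spend)
rows = [
    ("Brand 1", 1880000, 17, 83),
    ("Brand 5", 1230000, 63, 37),
]

for brand, total, pct_image, pct_video in rows:
    image, video = split_spend(total, pct_image, pct_video)
    print(f"{brand}: image ${image:,.0f} vs video ${video:,.0f}")
```

The resulting pairs are the kind of values a side-by-side bar chart of image versus video spend would be drawn from.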

February 10, 2021 · Christopher Yee

10 Lessons Learned from 10 Years of Search Marketing

July 2020 marked the 10-year anniversary of my blog. If you had asked me a decade ago what I would be blogging about now, my answer would have been SEO. I never would have guessed it would shift to data science and data visualization topics. To end this chaotic year on a high note, I want to share the top 10 things I learned over the course of my career in search engine marketing. I hope someone will find this useful regardless of industry and tenure in their field. ...

December 31, 2020 · Christopher Yee

2020 US Elections: calculating win thresholds

The 2020 US presidential election is coming down to the wire and I thought it would be fun to share how I calculate the answer to the following question: What is the distribution of the remaining ballots that Biden/Trump needs to win the electoral college votes for a given state? If we join this data point with Biden vs Trump mail-in ballot rates, or any other information for that matter, then it returns a fairly decent estimate on how thin/wide the margins will be for the US presidential race. ...
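One simple way to frame the threshold question is sketched below; it is not necessarily the calculation used in the post. It assumes a two-candidate race in which every remaining ballot goes to one of the two candidates. If a candidate has a votes, the opponent has b votes, and r ballots remain, the candidate needs a share s of the remaining ballots with a + s·r > b + (1 − s)·r, which gives s > (b − a + r) / (2r). The function name and arguments are hypothetical.

```python
def win_threshold(votes_for, votes_against, remaining):
    """Share of remaining ballots a candidate must exceed to win.

    Assumes a two-candidate race where every remaining ballot goes to one of
    the two candidates. A result above 1 means the candidate can no longer
    win; a result below 0 means the lead is already out of reach.
    """
    if remaining <= 0:
        raise ValueError("remaining must be a positive number of ballots")
    return (votes_against - votes_for + remaining) / (2 * remaining)


# Illustrative numbers only, not real election data.
share = win_threshold(votes_for=100, votes_against=120, remaining=50)
print(f"Needs more than {share:.1%} of the remaining ballots")
```

With these illustrative numbers the trailing candidate needs more than 70% of the remaining 50 ballots: at exactly 35 of 50 the race would be tied at 135 votes each.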

November 6, 2020 · Christopher Yee

From deterministic to probabilistic SEM bid optimization

The goal of every search engine marketing (SEM) advertiser is to maximize their returns at the lowest possible cost. Campaign performance is primarily tuned by adjusting the maximum cost per click (CPC) bid for each ad. However, finding the “perfect” CPC bid can be a moving target since the auction is constantly in flux.

The “sleeper” problem

Imagine an extreme (but likely) scenario where ad spend is significantly over the allocated budget for the month. The SEM expert can do a few things to quickly get the account back in shape: ...

October 1, 2020 · Christopher Yee

California Wildfires: cumulative acres burned over time

Wildfires are raging across California (again).

"Always knew I would end up in hell but I imagined it was more of a spontaneous combustion type of event rather than a gradual descent into the infernal #everythingisfine pic.twitter.com/gl6otozX6f" (Christopher Yee, @Eeysirhc, September 8, 2020)

What I noticed over the years of “doom watching” is how the news only reports tabulated data, without any sort of visualization to underscore the impact of these fires. ...

September 16, 2020 · Christopher Yee

Firearm Sales: How are Americans coping with 2020?

The US has a peculiar relationship with guns where we frequently observe nontrivial spikes in firearm sales. These are triggered (pun intended) by various political, economic, and social events at the time. With 2020 being an especially chaotic year, I wanted to explore how that phenomenon is reflected in American gun purchases thus far.

Load modules

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

Retrieve data

I was unable to find confirmed and accurate gun purchase data that is free on the web. However, we can use data from the FBI’s National Instant Criminal Background Check System (NICS) as a proxy. ...

September 14, 2020 · Christopher Yee