Christopher Yee

From deterministic to probabilistic SEM bid optimization

The goal of every search engine marketing (SEM) advertiser is to maximize their returns at the lowest possible cost. Campaign performance is primarily tuned by adjusting the maximum cost per click (CPC) bid for each ad. However, finding the “perfect” CPC bid can be a moving target since the auction is constantly in flux.

The “sleeper” problem

Imagine an extreme (but likely) scenario where ad spend is significantly over the allocated budget for the month. The SEM expert can do a few things to quickly get the account back in shape: ...

California Wildfires: cumulative acres burned over time

Wildfires are raging across California (again).

“Always knew I would end up in hell but I imagined it was more of a spontaneous combustion type of event rather than a gradual descent into the infernal #everythingisfine pic.twitter.com/gl6otozX6f” (Christopher Yee, @Eeysirhc, September 8, 2020)

What I noticed over the years of “doom watching” is that the news only reports tabulated data, with no visualization to underscore the impact of these fires. ...

Firearm Sales: How are Americans coping with 2020?

The US has a peculiar relationship with guns where we frequently observe nontrivial spikes in firearm sales. These are triggered (pun intended) by various political, economic, and social events at the time. With 2020 being an especially chaotic year, I wanted to explore how that phenomenon is reflected in American gun purchases thus far.

Load modules

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

Retrieve data

I was unable to find confirmed and accurate gun purchase data that is free on the web. However, we can use data from the FBI’s National Instant Criminal Background Check System (NICS) as a proxy. ...

On Deep Work

Ask my wife anything about me and she’ll be the first to testify that I have more books than clothing (lol). One book that dramatically changed my life, and that I highly recommend reading, is “Deep Work” by Cal Newport.

“Stuck at home, bored of Netflix/Hulu/Disney+ and need a reprieve from the onslaught of #coronavirus news? Here are my top 20 favorite non-business books (out of 250+) I recommend to get your mind off things pic.twitter.com/cC93iEaOFQ” ...

Build a loan amortization schedule with Python

With mortgage rates at a historical low there are inklings the US housing market is heating up again. Buying a home is a huge decision and in a perfect world everyone weighs their options and makes a (relatively) rational choice. One approach is to lay out all the mortgage offers and compare how much more we’re paying over the life of the loan. In this article I want to achieve a few things: ...
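To make the comparison concrete, here is a minimal sketch of an amortization schedule for a fixed-rate, fully amortizing loan, using the standard annuity payment formula. The function name `amortization_schedule` and the two loan offers are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def amortization_schedule(principal, annual_rate, years):
    """Return (monthly_payment, rows) for a fixed-rate, fully amortizing loan.

    Each row is (month, payment, interest, principal_paid, remaining_balance).
    """
    n = years * 12            # total number of monthly payments
    r = annual_rate / 12      # monthly interest rate
    if r == 0:
        payment = principal / n
    else:
        # Standard annuity formula for a level monthly payment
        payment = principal * r / (1 - (1 + r) ** -n)

    balance = principal
    rows = []
    for month in range(1, n + 1):
        interest = balance * r
        principal_paid = payment - interest
        # Clamp tiny negative balances caused by floating point rounding
        balance = max(balance - principal_paid, 0.0)
        rows.append((month, payment, interest, principal_paid, balance))
    return payment, rows


# Hypothetical offers (amount, annual rate, term in years), not real quotes
offers = {
    "30y @ 3.0%": (300_000, 0.03, 30),
    "15y @ 2.5%": (300_000, 0.025, 15),
}

for name, (amount, rate, term) in offers.items():
    payment, rows = amortization_schedule(amount, rate, term)
    total_interest = sum(row[2] for row in rows)
    print(f"{name}: payment={payment:,.2f}, total interest={total_interest:,.2f}")
```

Summing the interest column for each offer gives the "how much more we're paying" figure directly, which is usually easier to compare than the headline rate alone.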

Visualizing the relationship between quality score & CPC

The SEM industry has published a lot of information about the importance of improving quality score (QS) to lower average cost per click (CPC). Most of those articles, however, just share a table with quality score in one column and its associated % increase/decrease to average CPC in the other. Although helpful, I think this misses the mark on conveying how much QS can help CPC. We will do something different: the Python code below will take that data and visualize the impact on average CPC for a given quality score. ...

Star Wars: exploring Lucas vs Disney era ticket sales

With the end of the latest Star Wars trilogy, I wanted to compare, contrast, and explore Lucas vs Disney era domestic box office revenue. The analysis and Python code below parse weekly ticket sales from Box Office Mojo, adjust revenue numbers for inflation, visualize the results, and attempt to uncover insights from the data.

TL;DR

- The top 3 revenue-generating films (inflation-adjusted) are the first movie of each trilogy
- Disney era films do not make it past week 20, unlike the Lucas era films
- On average, Lucas era movies generate 80% of their revenue within the first 10 weeks of release, while Disney era movies take 2.8 weeks

Load modules

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    sns.set(style="darkgrid")

Define function

    def movie_revenue(era, trilogy, movie, url):
        # RETRIEVE DATA FROM URL
        movie_data = pd.read_html(url)[0]

        # CUMULATIVE REVENUE: STRIP "," AND "$" (LITERAL, NOT REGEX), THEN CONVERT TO MILLIONS (1e6)
        movie_data['Cumulative_Revenue'] = (
            movie_data['To Date']
            .str.replace(',', '', regex=False)
            .str.replace('$', '', regex=False)
            .astype(int)
        )
        movie_data['Cumulative_Revenue'] = movie_data['Cumulative_Revenue'] / 1e6

        # WEEKLY REVENUE: SAME CLEANUP, THEN CONVERT TO MILLIONS (1e6)
        movie_data['Weekly_Revenue'] = (
            movie_data['Weekly']
            .str.replace(',', '', regex=False)
            .str.replace('$', '', regex=False)
            .astype(int)
        )
        movie_data['Weekly_Revenue'] = movie_data['Weekly_Revenue'] / 1e6

        # SELECT WEEK INDEX & REVENUE DATA (COPY TO AVOID SettingWithCopyWarning)
        movie_data = movie_data[['Week', 'Weekly_Revenue', 'Cumulative_Revenue']].copy()

        # ADD ADDITIONAL COLUMNS
        movie_data['era'] = era
        movie_data['trilogy'] = trilogy
        movie_data['movie'] = movie

        return movie_data

Set parameters

    # LIST: PRODUCER ERA, TRILOGY, MOVIE NAME, URL TO CRAWL
    sw_list = [['Lucas','Prequel','The Phantom Menace','https://www.boxofficemojo.com/release/rl2742257153/weekly/'],
               ['Lucas','Prequel','Attack of the Clones','https://www.boxofficemojo.com/release/rl2809366017/weekly/'],
               ['Lucas','Prequel','Revenge of the Sith','https://www.boxofficemojo.com/release/rl2943583745/weekly/'],
               ['Lucas','Original','A New Hope','https://www.boxofficemojo.com/release/rl2759034369/weekly/'],
               ['Lucas','Original','The Empire Strikes Back','https://www.boxofficemojo.com/release/rl2775811585/weekly/'],
               ['Lucas','Original','Return of the Jedi','https://www.boxofficemojo.com/release/rl2792588801/weekly/'],
               ['Disney','Sequel','The Force Awakens','https://www.boxofficemojo.com/release/rl2691925505/weekly/'],
               ['Disney','Sequel','The Last Jedi','https://www.boxofficemojo.com/release/rl2708702721/weekly/'],
               ['Disney','Sequel','The Rise of Skywalker','https://www.boxofficemojo.com/release/rl3305145857/weekly/'],
               ['Disney','SW Story','Rogue One','https://www.boxofficemojo.com/release/rl2557707777/weekly/'],
               ['Disney','SW Story','Solo','https://www.boxofficemojo.com/release/rl1954383361/weekly/']]

Retrieve and parse data

    star_wars = []

    for m in sw_list:
        data = movie_revenue(m[0], m[1], m[2], m[3])
        star_wars.append(data)

    star_wars = pd.concat(star_wars)

Spot check

    from random import randint

    # randint IS INCLUSIVE ON BOTH ENDS, SO SUBTRACT 1 TO STAY IN RANGE
    star_wars.iloc[randint(0, len(star_wars) - 1)]

    ## Week                                      10
    ## Weekly_Revenue                      0.762516
    ## Cumulative_Revenue                   514.326
    ## era                                   Disney
    ## trilogy                               Sequel
    ## movie                 The Rise of Skywalker
    ## Name: 9, dtype: object

Inflation-adjusted revenue

    # MOVIE TITLE
    m = [t[2] for t in sw_list]

    # YEAR OF MOVIE
    y = [2000, 2002, 2005, 1977, 1980, 1983, 2016, 2018, 2019, 2017, 2018]

    # INFLATION RATE (https://www.bls.gov/data/inflation_calculator.htm)
    i = [0.497, 0.433, 0.32, 3.254, 2.129, 1.588, 0.074, 0.027, 0.008, 0.052, 0.027]

    # JOIN ALL THREE LISTS
    inflation = list(zip(m, y, i))

    # CONVERT TO PANDAS DATAFRAME
    inflation = pd.DataFrame(inflation)

    # CREATE & APPLY COLUMN NAMES
    labels = ["movie", "year", "inflation_rate"]
    inflation.columns = labels

    # COMBINE DATAFRAMES
    star_wars_adjusted = star_wars.merge(inflation, how='left', on='movie')

    # INFLATION-ADJUSTED REVENUE: WEEKLY & CUMULATIVE
    star_wars_adjusted['Adjusted Weekly Revenue'] = star_wars_adjusted['Weekly_Revenue'] * (1 + star_wars_adjusted['inflation_rate'])
    star_wars_adjusted['Adjusted Cumulative Revenue'] = star_wars_adjusted['Cumulative_Revenue'] * (1 + star_wars_adjusted['inflation_rate'])

Final spot check

    star_wars_adjusted.iloc[randint(0, len(star_wars_adjusted) - 1)]

    ## Week                                   46
    ## Weekly_Revenue                   0.792809
    ## Cumulative_Revenue                213.955
    ## era                                 Lucas
    ## trilogy                          Original
    ## movie                          A New Hope
    ## year                                 1977
    ## inflation_rate                      3.254
    ## Adjusted Weekly Revenue           3.37261
    ## Adjusted Cumulative Revenue       910.166
    ## Name: 101, dtype: object

What are the total ticket sales for the entire franchise?

    star_wars_adjusted.agg({'Adjusted Weekly Revenue':'sum'}).iloc[0]

    ## 6125.216885688999

Over $6.1 billion in gross revenue (inflation-adjusted) was made from the US box office alone. ...
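The TL;DR compares how quickly each era reaches 80% of a film's revenue. One way such a figure could be computed is sketched below in plain Python. The helper `weeks_to_share` and the weekly numbers are hypothetical (made-up values in millions, not Box Office Mojo data), and this is not necessarily how the averages quoted above were derived.

```python
from itertools import accumulate


def weeks_to_share(weekly_revenue, share=0.8):
    """Return the first week (1-indexed) at which cumulative revenue
    reaches `share` of the film's total revenue."""
    total = sum(weekly_revenue)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    threshold = share * total
    for week, running_total in enumerate(accumulate(weekly_revenue), start=1):
        if running_total >= threshold:
            return week
    # Only reachable through floating point rounding when share is ~1.0
    return len(weekly_revenue)


# Made-up weekly revenue (in millions) for a front-loaded release
toy_weekly = [50.0, 25.0, 10.0, 8.0, 4.0, 2.0, 1.0]

print(weeks_to_share(toy_weekly))
```

Applied per movie to the adjusted weekly revenue and then averaged within each era, a helper like this would give one number per era to compare.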

Examining drug effectiveness studies via simulation

One of my dogs was recently diagnosed with an enlarged heart so the vet prescribed some medicine to mitigate the problem. The box came with a pamphlet which included the company’s effectiveness study for the drug, Vetmedin. I thought it would be fun to visualize one portion of the study with simulation. What follows is the #rstats code I used to examine and review the drug’s efficacy based on the reported results. ...