This has grown significantly more complicated over successive iterations; see an earlier version in the history (such as 2633d5b) for a simpler example.
Note the geographical caveats in the NYT dataset. The most annoying are that the 5 NYC boroughs are all lumped together and that Kansas City, MO reports its numbers separately from the 4 counties it touches. Also note that US territories and the District of Columbia are listed as "states."
This dataset is joined with the US Census population estimates for 2019 to compute rates per population, but those caveats need special care. In particular, the city of KCMO is split across portions of 4 different counties (Cass, Clay, Jackson and Platte) and does not wholly encompass any one of them. Since KCMO is reported separately, we pull its population from the Census 2018 estimate (2019 is not yet available), and then subtract the city's share from each of those four counties (using the 2018 breakdown available at MARC).
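The KCMO adjustment described above can be sketched roughly as follows. This is an illustration in Python, not the project's actual code: the function name `carve_out_city`, its parameters, and every number in the example are hypothetical placeholders rather than real Census or MARC figures.

```python
def carve_out_city(county_pop, city_name, city_pop, city_portions):
    """Return a new population table with the city listed as its own entry.

    county_pop:    {county name: total county population}
    city_pop:      the city's own population estimate
    city_portions: {county name: city residents living in that county}
    """
    adjusted = dict(county_pop)  # copy so the input table is left untouched
    for county, portion in city_portions.items():
        if portion > adjusted[county]:
            raise ValueError(f"city portion exceeds population of {county}")
        # Remove the city's residents so they are not counted twice.
        adjusted[county] -= portion
    adjusted[city_name] = city_pop
    return adjusted


# Placeholder numbers only, NOT real Census or MARC figures.
counties = {"Cass": 1000, "Clay": 2000, "Jackson": 7000, "Platte": 1000}
kc_portions = {"Cass": 10, "Clay": 1200, "Jackson": 3000, "Platte": 500}
adjusted = carve_out_city(counties, "Kansas City", 4710, kc_portions)
```

Passing the city total separately from the per-county portions mirrors the two sources mentioned above (one for the city estimate, one for the breakdown); whether the two agree exactly is not something this sketch checks.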
Population estimates are not applied to US territories (so population normalization is not available for them), and some states report cases from "Unknown" counties (with 0 population).
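One way to handle rows without a usable population is to return no rate at all instead of dividing by zero. The sketch below is in Python for illustration only; the name `cases_per_100k` and the per-100,000 scaling are assumptions, not necessarily what the app does.

```python
def cases_per_100k(cases, population):
    """Cases per 100,000 residents, or None when no population is available.

    Territories (no estimate, passed as None) and "Unknown" counties
    (population 0) both fall into the None branch.
    """
    if not population:  # covers both None and 0
        return None
    return cases * 100_000 / population


# Placeholder inputs, not real data.
normal_rate = cases_per_100k(50, 10_000)
unknown_county_rate = cases_per_100k(50, 0)
territory_rate = cases_per_100k(50, None)
```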
This is set up to be hosted on a free-tier Heroku dyno. In the free tier, the app gets killed after 30 minutes of inactivity and we're given terribly anemic access to the CPU. To reduce startup costs, a special Heroku buildpack that uses PackageCompiler.jl is under development.