GPA Photo Archive: https://www.flickr.com/photos/iip-photo-archive/27336508138

A few weeks ago, I wrote an article using historical population data from numerous US cities. While much of this data came directly from the US Census, I also scraped population data from Wikipedia tables, which compile all available data for each city in one place. …

Photo by Jonathan Meyer from Pexels

There are so many articles and social media posts out there attempting to define the difference between data analyst and data scientist roles, a goal that I’ve never quite understood. This distinction varies so widely from industry to industry and company to company that it seems impossible to draw clear…

Photo by Pixabay from Pexels

The janitor Package

The janitor package is available on CRAN and was created by Sam Firke, Bill Denney, Chris Haid, Ryan Knight, Malte Grosser, and Jonathan Zadra. While arguably best known for its extraordinarily useful clean_names() function (which I will be covering later on in this article), the janitor package has a wide…

Photo by energepic.com from Pexels

Shiny is used by many data scientists and data analysts to create interactive visualizations and web applications. While Shiny is an RStudio product and quite user-friendly, the development of a Shiny app differs significantly from the data visualization and exploration that you might do via the tidyverse in an RMarkdown…

Image source: https://commons.wikimedia.org/wiki/File:Joe_Biden_(49536511763).jpg

I published an analysis back in October in which I used Google searches to predict the results of the 2020 U.S. presidential election. I was largely focused on six swing states — Arizona, Florida, Michigan, North Carolina, Pennsylvania, and Wisconsin. …

Hands-on Tutorials, VIDEO TUTORIAL

Photo by Magda Ehlers from Pexels

One of the most common analyses conducted by data scientists is the evaluation of linear relationships between numeric variables. These relationships can be visualized using scatterplots, and this step should be taken regardless of any further analyses that are conducted. …

Photo by Element5 Digital from Pexels

Background

Most of us here in the U.S. are waiting with bated breath for the results of next week’s enormously consequential presidential election. Virtually all of the data providing insight into the likely outcome comes in the form of polling data, which, while extremely valuable, is also inherently imperfect. Selection bias…

Emily A. Halford

I am currently a data analyst working in psychiatric epidemiology, and I am excited about the intersection of data science and mental health. Views are my own.
