Importing presidential approval poll results

Download the PDF from SSRN.
Examples: approval.

Abstract

The American Presidency Project provides presidential job approval poll results. These data are available for each U.S. president since President Franklin D. Roosevelt and for all the job approval polls conducted. The proposed Stata command, approval, downloads these presidential approval poll results in their original format, an HTML table. The command then parses the HTML table and prepares the data as a usable Stata dataset.
Keywords: Presidential job approval, presidential popularity, U.S. presidents, parse HTML

Introduction

The American Presidency Project provides a wide range of valuable data related to U.S. presidents. Among these publicly available data, presidential job approval poll results are compiled by Gerhard Peters using the Gallup Polls. These data are available for each U.S. president and for all the presidential job approval polls conducted since President Franklin D. Roosevelt. The approval data are available through The American Presidency Project web page in HTML format. Without automation, the poll results have to be copied and pasted into a text editor and edited further before they can be used in Stata.

The proposed Stata command, approval, automates the process of accessing and parsing the presidential approval data. The data are available for each president separately. With approval, the poll results are accessed, downloaded as HTML, and then parsed. The end result is a presidential job approval poll results dataset that is ready for use in Stata. Poll results may be processed either for an individual U.S. president or for multiple presidents. If multiple presidents are requested, the data are appended, and the presidency number may be used as the panel variable.
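The appending step can be pictured with a short Python sketch. This is only an illustration of the idea, not the command's own code: the function name append_presidents and the sample rows are invented for the example, and the key president2 simply mirrors the variable name that approval uses for the presidency number.

```python
def append_presidents(results):
    """Stack per-president rows into one list, tagging each row with its
    presidency number so that it can serve as a panel identifier."""
    combined = []
    for number in sorted(results):
        for row in results[number]:
            # president2 mirrors the variable name used by the approval command
            combined.append({"president2": number, **row})
    return combined


# Invented placeholder rows; real values come from the downloaded tables.
sample_results = {
    44: [{"approving": 50, "disapproving": 45}],
    43: [{"approving": 60, "disapproving": 35},
         {"approving": 58, "disapproving": 37}],
}

panel = append_presidents(sample_results)
for row in panel:
    print(row)
```

Each row in the combined list carries its presidency number, which is what allows the stacked data to be treated as a panel.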

The approval command

The presidential job approval poll results are provided at the following URL: http://www.presidency.ucsb.edu/data/popularity.php?pres=44. Changing the number in the pres field of the URL returns the results for the corresponding U.S. president. The page served at this URL contains a list of HTML tables. All but one of these tables hold web page content other than the presidential job approval poll results.

As its first step, approval fetches the content of the above URL into a string variable. After the web content is retrieved, each table enclosed in the table (“table”) HTML tags is parsed into a cell of a string vector. Since the table whose first-row, first-column content equals “President:” holds the presidential job approval poll results, the corresponding vector cell is kept and the others are discarded. The vector cell that contains the data is then assigned to a string, and all end-of-row HTML tags (“/tr”) are replaced with a carriage return (char(13)). Up to this point, approval uses Mata code.

The resulting string is tokenized by carriage returns, transposed, and transferred to Stata as a string variable. The final processing in Stata splits each observation (each table row of the data) using the end-of-column HTML tags (“/td”). The resulting data have the columns of the original table as the variables and the rows of the original table as the observations. Two additional variables are generated: 1) president, which contains the name of the president, and 2) president2, which contains the presidency number of the president. All variables are formatted to their original formats: string for president, float for president2, byte for approving/disapproving/unsure, and float for startdate/enddate.
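The parsing steps described above can be sketched in Python. This is a minimal illustration of the logic under stated assumptions, not the Mata and Stata code that approval actually runs: all function names are invented for the example, the sample HTML and its values are made up, and the regular expressions assume that tables are not nested. Fetching the page is left out so that the sketch runs offline.

```python
import re
from typing import List

# URL pattern quoted in the text; only the pres field changes.
BASE_URL = "http://www.presidency.ucsb.edu/data/popularity.php?pres={}"


def build_url(pres: int) -> str:
    """Return the page URL for a given presidency number."""
    return BASE_URL.format(pres)


def strip_tags(fragment: str) -> str:
    """Remove leftover HTML tags and surrounding whitespace."""
    return re.sub(r"<[^>]*>", "", fragment).strip()


def split_tables(html: str) -> List[str]:
    """Return the inner HTML of every table (assumes no nested tables)."""
    return re.findall(r"<table[^>]*>(.*?)</table>", html, flags=re.S | re.I)


def table_to_rows(table: str) -> List[List[str]]:
    """Split a table on the end-of-row tag, then each row on the end-of-column tag."""
    rows = []
    for raw_row in re.split(r"</tr>", table, flags=re.I):
        pieces = re.split(r"</td>", raw_row, flags=re.I)
        cells = [strip_tags(p) for p in pieces[:-1]]  # last piece follows the final </td>
        if cells:
            rows.append(cells)
    return rows


def find_approval_table(html: str) -> List[List[str]]:
    """Keep the one table whose first-row, first-column cell is 'President:'."""
    for table in split_tables(html):
        rows = table_to_rows(table)
        if rows and rows[0][0] == "President:":
            return rows
    return []


# Made-up page content for illustration; the real page layout may differ.
SAMPLE = """
<table><tr><td>Menu</td></tr></table>
<table>
<tr><td>President:</td><td>Example President</td></tr>
<tr><td>Start Date</td><td>End Date</td><td>Approving</td><td>Disapproving</td><td>Unsure</td></tr>
<tr><td>01/02/2015</td><td>01/04/2015</td><td>50</td><td>45</td><td>5</td></tr>
</table>
"""

rows = find_approval_table(SAMPLE)
for row in rows:
    print(row)
```

The sketch splits on the closing tags directly instead of first substituting a carriage return for each end-of-row tag, which is a simplification of the two-stage Mata-then-Stata process described above; the resulting rows and columns are the same.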

Important Mata functions used in the approval code

The following paragraphs provide the Mata functions and code used in the approval code. These functions are for general parsing purposes and can be used in creating other Stata commands that parse HTML code.


Syntax

Options
  • president(numlist>31 integer) is the list of U.S. presidents’ presidency numbers. The list may contain one president or multiple presidents. The presidency number becomes the content of the variable president2, and the name of the corresponding president becomes the content of the variable president. Presidency numbers are as follows: Franklin D. Roosevelt is the 32nd president, Harry S. Truman is the 33rd president, Dwight D. Eisenhower is the 34th president, John F. Kennedy is the 35th president, Lyndon B. Johnson is the 36th president, Richard Nixon is the 37th president, Gerald R. Ford is the 38th president, Jimmy Carter is the 39th president, Ronald Reagan is the 40th president, George Bush is the 41st president, William J. Clinton is the 42nd president, George W. Bush is the 43rd president, and Barack Obama is the 44th president.
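The relationship between presidency numbers and names can be illustrated with a small Python lookup. This is a sketch for illustration only: the names PRESIDENTS and president_names are invented here, and the validation merely mimics the numlist>31 integer restriction rather than reproducing how Stata checks the option.

```python
# Presidency numbers and names exactly as listed in the option description.
PRESIDENTS = {
    32: "Franklin D. Roosevelt",
    33: "Harry S. Truman",
    34: "Dwight D. Eisenhower",
    35: "John F. Kennedy",
    36: "Lyndon B. Johnson",
    37: "Richard Nixon",
    38: "Gerald R. Ford",
    39: "Jimmy Carter",
    40: "Ronald Reagan",
    41: "George Bush",
    42: "William J. Clinton",
    43: "George W. Bush",
    44: "Barack Obama",
}


def president_names(numbers):
    """Check that each entry is an integer greater than 31 and return the names."""
    names = []
    for n in numbers:
        if not isinstance(n, int) or n <= 31:
            raise ValueError(f"presidency number must be an integer > 31: {n!r}")
        if n not in PRESIDENTS:
            raise ValueError(f"no president listed for number {n}")
        names.append(PRESIDENTS[n])
    return names


print(president_names([44]))
print(president_names([32, 41, 43]))
```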


How to install
Then, click on the approval link and then “click here to install”.

Example #1: Single U.S. president’s job approval poll results

In this example, the presidential job approval poll results for President Barack Obama, the 44th U.S. president, are downloaded and parsed.

approval.ado

approval.hlp

approval.pkg

stata.toc