Financial Portfolio Selection using Multi-factor Capital Asset Pricing Model and Importing Options Data


Download the PDF from SSRN.
Examples: fetchcomponents.


Diversification and portfolio selection are integral parts of finance teaching. In this study, a multi-factor Capital Asset Pricing Model (CAPM) is estimated for the components of the Dow Jones Composite Index using data from Yahoo! Finance. Along with CAPM’s Beta, other statistics that are common decision criteria for portfolio selection are calculated, such as historic standard deviation (total risk), total return, average daily return, and the Sharpe and Treynor measures. Two new commands are introduced, fetchcomponents and fetchportfolio, that automate the entire process. A third new command, fetchyahoooptions, is provided to download and parse equity options data from Yahoo! Finance web pages and, optionally, to calculate the implied volatilities for the downloaded options.
Keywords: Finance, financial data, multi-factor CAPM, Beta, diversification, portfolio selection, Sharpe, Treynor, options, implied volatility


Diversification, portfolio selection, hedging and the Capital Asset Pricing Model (CAPM) (Sharpe (1964) and Lintner (1965)) are integral parts of the undergraduate and graduate finance curriculum. Most textbooks provide detailed explanations of how to diversify, how to evaluate different financial securities and how to estimate CAPM. Given the importance of a “learning by doing” or “hands-on” approach, financial calculations and CAPM estimations are performed in classrooms as well as through assignments. There are abundant resources on these financial calculations and CAPM estimations, mostly using Microsoft Excel. However, there seem to be limited resources for students to do financial calculations and CAPM estimation using econometric software such as STATA. Considering the speed and ease of repeatability of STATA, this is an important void in finance teaching. Teaching should focus more on theory and interpretation of results than on long and tiresome steps of calculation and estimation.

It is true that MS-Excel is one of the primary software packages in the industry and an asset for finance students. By default, they should be able to estimate CAPM and perform other financial calculations using MS-Excel. However, automated tasks such as estimating CAPM for multiple stocks would require them to learn either MS-Excel macro programming or some type of econometric software. In recent years, STATA has become a popular choice among academics and students, perhaps because of its ease of use or the abundant resources available to STATA users.

The analysis performed in this study is of interest to finance instructors, students and investors. (This study and the associated STATA code are for educational use. There is no direct or implied financial advice. While every effort is made for accuracy and reliability, the data and the results may not be accurate or reliable.) This study provides lecture notes for teaching different criteria for portfolio selection, diversification and options. It shows how some of the most common statistics are calculated. It provides fast and easy commands to repeat these tasks during a study session or a lecture using STATA. It enables finance instructors to assign projects using real-life data and to spend time on interpretations and on methods in proportion to their importance. The procedures provided with this study are also useful for investors (for educational purposes). Using real financial data, investors can compare their investment choices to achieve portfolio objectives.

In this study, the initial step is to obtain the list of stocks that make up the Dow Jones Composite Index. For this task a new STATA command is used: fetchcomponents. Then, historic prices for these stocks are downloaded using the STATA command fetchyahooquotes (Dicle (2011)). Using the same command, Fama-French factors are also downloaded to estimate the multi-factor CAPM (Fama (1992) and Fama (1993)). The third step is to calculate average daily returns, total returns, the standard deviation of daily returns, and the Sharpe (1970) and Treynor (1965) measures. Within this step, the multi-factor CAPM following Fama (1992) and Fama (1993) is also estimated for each stock. Another new STATA command is used to automate this process: fetchportfolio. The fourth step is the interpretation of the results. For the final step, as an introduction to hedging portfolio risk, options data are downloaded for a few stocks using a new STATA command, fetchyahoooptions. Their implied volatilities are calculated and graphed.

Index components: fetchcomponents

There are several financial indices, such as the S&P-500 and the Dow Jones Industrial Average, available to investors. The list of stocks that make up a financial index is usually referred to as its index components. While some of these lists may not be available, most of them are accessible via the Yahoo! Finance web site. fetchcomponents downloads the list of symbols for the components of indices where available.



fetchcomponents downloads the list of components for an index. The list of stocks (components) is provided by Yahoo! Finance.

  • symbol is the index whose components are downloaded (e.g. ^NYA). Only one symbol can be specified, and it must be an index.

How to install
Then, click on the fetchcomponents link and then “click here to install”.

Example #1: Usage

[Screenshot: fetchcomponents usage example]





Options Data: fetchyahoooptions

Yahoo! Finance provides free financial data for public use. While historic prices for most financial assets as well as some important statistics can be downloaded using Yahoo! Finance’s API (the Stata commands fetchyahooquotes and fetchyahookeystats utilize Yahoo! Finance’s API to download historic prices and key statistics), some data are only available through web pages. These data can be viewed in a web browser. However, to make them usable as Stata data, the web pages need to be parsed. (Parsing HTML pages is a common practice, and some programming languages, such as PHP and JavaScript, provide extensive support for it. Regular expressions, a pattern-matching language, are supported by most web programming languages.) Stata has a powerful and fast programming language: Mata. Even though it is intended as a matrix programming language, it has HTTP protocol support, regular expressions and extensive string functions. These features allow the newly introduced command fetchyahoooptions to fetch the Yahoo! Finance options page, parse it and process its contents into usable Stata data. fetchyahoooptions also calculates the implied volatility for downloaded options using the Black (1973) option pricing formula, given by the following equations:

d_1=\frac{\ln(S_{0}/K)+(r_{f}+\frac{\sigma^2}{2})T}{\sigma \sqrt{T}}

d_2=d_1-\sigma \sqrt{T}

c=S_{0} N(d_{1}) - K e^{-r_{f}T} N(d_{2})

p=K e^{-r_{f}T} N(-d_{2}) - S_{0} N(-d_{1})

Unlike equity prices, options data are not easily accessible to everyone as usable data. It is an important free service that Yahoo! Finance offers for public use. Options data are important for finance lecturers to include in their lectures and teaching notes. They are important for researchers of financial derivatives. Although only current options data are offered and historical time-series are not available, the data can be downloaded daily and stored to create a time-series. Options data are also important for investors (for educational purposes). The implied volatility gives investors a sense of the expected volatility in the market. Volatility smiles may allow investors to predict market direction.


  • namelist is a list of ticker symbols for which the options are to be parsed and downloaded from Yahoo! Finance’s options web page. Symbols are separated by spaces.
  • m(string) is the maturity date on which the options expire (e.g. 2016-09-16). Multiple maturities can be included (e.g. 2016-09-16 2016-09-23).
  • iv(real) is the calculated implied volatility using the Black (1973) option pricing formula following the equations below. It uses a trial and error method, looping through levels of volatility to find the one whose call/put option price matches the ask price. Implied volatility is calculated separately for each strike price.
    d_1=\frac{\ln(S_{0}/K)+(r_{f}+\frac{\sigma^2}{2})T}{\sigma \sqrt{T}}

    d_2=d_1-\sigma \sqrt{T}

    c=S_{0} N(d_{1}) - K e^{-r_{f}T} N(d_{2})

    p=K e^{-r_{f}T} N(-d_{2}) - S_{0} N(-d_{1})

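    In these equations, S_0 refers to the spot price of the underlying security, K to the strike price, r_f to the risk-free rate, \sigma to the volatility (the annualized standard deviation of the underlying's returns) and T to the time to maturity in years. N(\cdot) denotes the standard normal cumulative distribution function, and c and p are the call and put prices. For the risk-free rate, fetchyahoooptions downloads the current ^IRX, the 13-week U.S. Treasury Bill yield index, from Yahoo! Finance.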
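
The pricing equations and the search for implied volatility can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the fetchyahoooptions code: the function names are hypothetical, the risk-free rate is passed in by hand rather than downloaded from ^IRX, and bisection is used in place of the command's trial and error loop (it relies on the option price increasing with volatility).

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_price(s0, k, rf, sigma, t, kind="call"):
    """Black-Scholes price of a European call or put (no dividends)."""
    d1 = (math.log(s0 / k) + (rf + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    if kind == "call":
        return s0 * norm_cdf(d1) - k * math.exp(-rf * t) * norm_cdf(d2)
    return k * math.exp(-rf * t) * norm_cdf(-d2) - s0 * norm_cdf(-d1)

def implied_vol(price, s0, k, rf, t, kind="call", lo=1e-4, hi=5.0, tol=1e-8):
    """Find the sigma at which the model price matches `price`, by bisection.

    Returns None when the target price is outside the range of model
    prices between sigma = lo and sigma = hi.
    """
    if bs_price(s0, k, rf, lo, t, kind) > price:
        return None
    if bs_price(s0, k, rf, hi, t, kind) < price:
        return None
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_price(s0, k, rf, mid, t, kind) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid  # model price too high: volatility must be lower
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Made-up inputs: spot 100, strike 100, 1% risk-free rate, half a year to maturity.
ask = bs_price(100.0, 100.0, 0.01, 0.25, 0.5, "call")
iv = implied_vol(ask, 100.0, 100.0, 0.01, 0.5, "call")
print(round(iv, 4))
```

Because the example price is generated from a volatility of 0.25, the search should recover approximately that value. With real quotes, the ask price of each strike would be passed as `price`.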

How to install
Then, click on the fetchyahoooptions link and then “click here to install”.

HTML source code to Stata data transformation

Yahoo! Finance provides current prices of options for individual stocks through HTML pages. fetchyahoooptions utilizes Mata to access Yahoo! Finance options pages, parse them into string variables and then to turn them into usable Stata data. The following are some of the processes that fetchyahoooptions utilizes.

The following Mata function is used to get the HTML source code from the web as a string.

The following Mata function parses the current price of the underlying asset for the option.

Above functions are called in the following order.

The remaining string is parsed for individual HTML tags such as “td”, “tr” and “br”. This parsing process is lengthy; the code can be found in the fetchyahoooptions.ado file and is not reproduced here to conserve space.

The entire HTML source code downloaded from Yahoo! Finance is read into a single string using Mata. The table that contains the options data is marked up with the HTML tag “td”, which can be used as a line break so that each piece can then be converted into a Stata observation. The following Stata code is used for this string split.

The resulting dataset contains nine variables: Strike, Symbol, Last, Change, Bid, Ask, Volume, Open_Interest and IV_Yahoo.
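
The idea of splitting an HTML table on its “tr” and “td” tags and mapping the cells to the nine variables can be illustrated with a short Python sketch. It is not the Mata or Stata code used by fetchyahoooptions; the function name and the sample markup are made up for illustration, and the real Yahoo! Finance page may be structured differently.

```python
import re

# Variable names as listed in the text above.
COLUMNS = ["Strike", "Symbol", "Last", "Change", "Bid", "Ask",
           "Volume", "Open_Interest", "IV_Yahoo"]

def parse_option_rows(html):
    """Split an HTML table into rows of cell text, one dict per option."""
    rows = []
    for tr in re.findall(r"<tr[^>]*>(.*?)</tr>", html, flags=re.S | re.I):
        cells = re.findall(r"<td[^>]*>(.*?)</td>", tr, flags=re.S | re.I)
        # Drop nested tags (links, spans) and surrounding whitespace.
        cells = [re.sub(r"<[^>]+>", "", c).strip() for c in cells]
        # Keep only rows that have exactly one cell per variable.
        if len(cells) == len(COLUMNS):
            rows.append(dict(zip(COLUMNS, cells)))
    return rows

# Made-up sample markup, for illustration only.
sample = """
<table>
<tr><th>Strike</th><th>Contract</th></tr>
<tr><td>150.00</td><td><a href="#">IBM160916C00150000</a></td><td>10.25</td>
<td>0.15</td><td>10.10</td><td>10.40</td><td>12</td><td>340</td><td>21.50%</td></tr>
</table>
"""
rows = parse_option_rows(sample)
print(rows[0]["Strike"], rows[0]["Symbol"], rows[0]["IV_Yahoo"])
```

The values are kept as strings here; converting them to numbers (for example, stripping the percent sign from IV_Yahoo) would be a separate step.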

Example #1: Usage

In this example, options data are downloaded for IBM and GOOG using fetchyahoooptions for the closest maturity (September 16, 2016) and the next closest maturity (September 23, 2016). The program also calculates the implied volatility.



I thank Jiad Alqotob, College of Business, Loyola University, New Orleans for his valuable comments during the creation of the fetchyahoooptions command.
I also thank Ashton Verdery, Department of Sociology, University of North Carolina at Chapel Hill for suggesting an improvement (implemented as suggested) in the fetchyahoooptions command to make it more reliable.





Calculating Financial Statistics and Estimating Multi-Factor CAPM

A new command, fetchportfolio, estimates the multi-factor CAPM and other financial statistics for all the stocks that make up the Dow Jones Composite Average (DJA). CAPM estimations are based on the daily percentage change of dividend- and split-adjusted closing prices. The formula for estimating the multi-factor CAPM (Fama (1992) and Fama (1993)) is as follows:
r_{i,t}-r_{f,t}=\alpha+\beta (r_{m,t}-r_{f,t}) + \gamma (SMB_t) + \theta (HML_t) +\epsilon_{t}
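
As an illustration of what estimating this equation involves, the following Python sketch fits the regression by ordinary least squares and reports R-squared. It is not the fetchportfolio code: the function names are hypothetical, and the factor series are made-up numbers built from known coefficients (alpha 0.02, beta 1.2, gamma 0.5, theta -0.3) with no noise, so the fit should recover them.

```python
def ols(y, xs):
    """Least-squares coefficients for y = a + b1*x1 + ... (intercept first).

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    n = len(y)
    cols = [[1.0] * n] + [list(x) for x in xs]
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in cols]
    for i in range(k):
        # Partial pivoting for numerical stability.
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [v - f * w for v, w in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

def r_squared(y, xs, coefs):
    """Share of the variation in y explained by the fitted regression."""
    fitted = [coefs[0] + sum(c * x[t] for c, x in zip(coefs[1:], xs))
              for t in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot

# Made-up daily series (in percent): market excess return, SMB and HML.
mkt = [0.5, -0.3, 0.8, -1.1, 0.2, 0.6]
smb = [0.1, 0.2, -0.1, 0.3, -0.2, 0.0]
hml = [-0.2, 0.1, 0.0, 0.2, 0.3, -0.1]
# Stock excess return r_i - r_f, generated without an error term.
excess = [0.02 + 1.2 * m + 0.5 * s - 0.3 * h for m, s, h in zip(mkt, smb, hml)]

alpha, beta, gamma, theta = ols(excess, [mkt, smb, hml])
r2 = r_squared(excess, [mkt, smb, hml], [alpha, beta, gamma, theta])
print(round(beta, 4), round(r2, 4))
```

With real data, the regression would be run once per stock and per year, and R-squared would fall below one because of the error term.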



fetchportfolio estimates and calculates financial statistics to compare financial securities for portfolio selection. fetchyahooquotes is needed for fetchportfolio to run.

  • namelist is a list of ticker symbols for which the statistics are calculated and CAPM is estimated. Symbols are separated by spaces.
  • year(numlist) is a list of years for which the statistics are calculated and CAPM is estimated. Years are separated by spaces.

How to install
Then, click on the fetchportfolio link and then “click here to install”.

Example #1: Usage

[Screenshots: fetchportfolio usage example]

Interpreting Results
Multi-factor CAPM Beta

[Screenshot: table of multi-factor CAPM Beta and R2 for each year]

In the table above, there are two columns for each year: Beta and R2. Beta is the coefficient \beta estimated with the regression equation: r_{i,t}-r_{f,t}=\alpha+\beta (r_{m,t}-r_{f,t}) + \gamma (SMB_t) + \theta (HML_t) +\epsilon_{t}. R2 refers to the R-squared of the same regression. CAPM is estimated for each stock separately for each year. A higher Beta (in absolute value) means higher market risk for the stock. A higher R-squared means that more of the variation in daily stock returns is explained by the included independent variables: market risk, small-minus-big and high-minus-low. If Beta increases from 2014 to 2015, we can conclude that the stock’s market risk increased. If R-squared increases from 2014 to 2015, we can conclude that factors not included in the model (e.g. company-specific risks) became less important in explaining the variation in daily returns.

Total return

[Screenshot: total return table]

Total return is the percentage change in split- and dividend-adjusted closing prices from the first day to the last day of the period plus the dividend yield for the year. The figures are in percentage terms (e.g. AAPL’s return for 2014 was 44.80% whereas it was -0.35% for 2015).
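
The arithmetic of this definition can be sketched in Python as follows. This is an illustration only, not the fetchportfolio code; the function name, the `dividend_yield_pct` parameter and the prices are hypothetical.

```python
def total_return_pct(adj_close, dividend_yield_pct=0.0):
    """Percentage change from the first to the last adjusted close,
    plus the dividend yield for the period (both in percent)."""
    first, last = adj_close[0], adj_close[-1]
    return (last / first - 1.0) * 100.0 + dividend_yield_pct

# Made-up adjusted closing prices for one period.
prices = [100.0, 104.0, 98.0, 120.0]
print(round(total_return_pct(prices), 2))
print(round(total_return_pct(prices, dividend_yield_pct=1.5), 2))
```

Only the first and last prices of the period enter the calculation; the prices in between do not affect the result.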

Total risk

[Screenshot: total risk table]

Total risk is the standard deviation of daily returns. The standard deviation of daily returns is based on the daily percentage change of dividend- and split-adjusted closing prices. It is interpreted as systematic and unsystematic risk combined.
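
A minimal Python sketch of this calculation is given below. It assumes the sample standard deviation (dividing by n - 1), which may or may not match what fetchportfolio does; the function names and prices are hypothetical.

```python
import statistics

def daily_returns_pct(adj_close):
    """Daily percentage change of adjusted closing prices."""
    return [(p1 / p0 - 1.0) * 100.0 for p0, p1 in zip(adj_close, adj_close[1:])]

def total_risk(adj_close):
    """Sample standard deviation of daily percentage returns."""
    return statistics.stdev(daily_returns_pct(adj_close))

# Made-up adjusted closing prices: +10% on the first day, -10% on the second.
prices = [100.0, 110.0, 99.0]
print([round(r, 2) for r in daily_returns_pct(prices)])
print(round(total_risk(prices), 4))
```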

Sharpe measure

[Screenshot: Sharpe measure table]

The formula for the Sharpe measure is as follows: Sharpe = (Mean daily return - mean risk-free rate) / Standard deviation of daily returns. Mean daily returns are based on the daily percentage change of dividend- and split-adjusted closing prices. The higher the ratio, the higher the mean excess return per unit of standard deviation (total risk).
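
The formula translates directly into code. The Python sketch below is illustrative only: the function name and the numbers are hypothetical, and the sample standard deviation is an assumption.

```python
import statistics

def sharpe(daily_returns, daily_rf):
    """(Mean daily return - mean risk-free rate) / std. dev. of daily returns."""
    excess = statistics.mean(daily_returns) - statistics.mean(daily_rf)
    return excess / statistics.stdev(daily_returns)

# Made-up daily returns and daily risk-free rates, both in percent.
returns = [1.0, 3.0]
risk_free = [0.0, 0.0]
print(round(sharpe(returns, risk_free), 4))
```

Two stocks with the same mean excess return would be ranked by this measure according to their total risk, with the less volatile stock scoring higher.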

Treynor measure

[Screenshot: Treynor measure table]

The formula for the Treynor measure is as follows: Treynor = (Mean daily return - mean risk-free rate) / Multi-factor CAPM Beta. As with the Sharpe measure, the higher the ratio, the higher the mean excess return per unit of Beta (market risk).
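
A corresponding Python sketch for the Treynor measure follows. It is illustrative only: the function name and the numbers are hypothetical, and Beta is passed in as a given value rather than estimated.

```python
import statistics

def treynor(daily_returns, daily_rf, beta):
    """(Mean daily return - mean risk-free rate) / multi-factor CAPM Beta."""
    excess = statistics.mean(daily_returns) - statistics.mean(daily_rf)
    return excess / beta

# Made-up daily returns and risk-free rates (in percent) and a made-up Beta.
returns = [1.0, 3.0]
risk_free = [0.5, 0.5]
print(treynor(returns, risk_free, beta=1.5))
```

The numerator is the same as in the Sharpe measure; only the risk measure in the denominator changes from total risk to market risk.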