I’ve lately been intrigued by doing some of my own analysis of the stock market, so I naturally decided to write some code to do so. I’m not a huge fan of doing this kind of analysis in Excel, because I don’t find it a repeatable or scalable way of analyzing lots of different equities. It’s hard to have a template and have live data flow into it without a ton of setup with commercial plugins on Windows. I also just find the Excel work daunting, and it ties you to a single computer.
Finance geeks will likely disagree with this, but my goals are also likely different. I’m basically trying to go for coverage across lots of different stocks, go semi-deep and easily update the inputs/assumptions vs. going super deep on a much smaller select few with harder to update inputs. It’s a bit of tapping in the dark to find a light switch (none found so far FYI) but my approach is to intentionally go through a process of learning about the stock market and about coding at the same time.
My approach also intentionally starts out with a large, growing and automatically updating opportunity set, i.e. putting the kitchen sink in and then starting to slowly clean out. I’m a rookie at this, what can I say, but it’s letting me explore a lot of different industries, companies and most importantly, I’m learning a lot about analysis metrics that I’m simply not that familiar with.
Long story short, I decided to use PHP to write some of the analysis framework and tools, because it’s client/server, I’m somewhat familiar with it, the syntax is simple and there’s tons of great documentation. It’s definitely not resource-efficient by classical standards (buy/rent a bigger box!) and it does make some things more complicated (it forces you into client/server rather than desktop, but that’s a good thing). There was really not much of a choice anyway, because I already knew enough of the rest of the LAMP stack to be up and running pretty quickly. I’m also not at all on my way to writing high-frequency or algorithmic trading code (hot stuff these days), so an interpreted language was just fine.
As part of this, I needed some basic financial, numerical and statistical analysis functions. PHP has only the very basics built in, and supposedly some others are available through PECL, but it’s a messy install. I therefore proudly present to you my homebrew solution and first piece of open-sourced code, “PHP StatFin 0.1”, released under the Apache License 2.0. I chose the Apache License because it is one of the less restrictive licenses out there. Although it’s short and simple code, I’m pretty happy to give a little code back, even though the application is highly niche.
In the package there are two files containing 4 useful functions:
- Covariance of two arrays. Note: this is a covariance of two known samples, not two random samples, so it divides by n, not n-1, exactly like Excel does.
- Variance of an array. Note: divides by n-1, like Excel.
- Beta of a set of equity prices vs. a set of index prices.
- Beta of a set of equity returns vs. a set of index returns.
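To make the n vs. n-1 distinction in the list above concrete, here is a minimal standalone sketch. It is not the package's code: the `sketch_*` function names are made up for illustration, and it assumes PHP 7.1 or newer for the type hints. The covariance divides by n and the variance divides by n-1, which as far as I know corresponds to Excel's COVAR and VAR functions.

```php
<?php
// Hypothetical helper names, for illustration only; not part of PHP StatFin.

// Sum of the products of each pair's deviations from its own mean.
function sketch_sum_product_dev(array $a, array $b): float {
    $n = count($a);
    $meanA = array_sum($a) / $n;
    $meanB = array_sum($b) / $n;
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $sum += ($a[$i] - $meanA) * ($b[$i] - $meanB);
    }
    return $sum;
}

// Covariance of two known samples: divides by n.
function sketch_covariance(array $a, array $b): ?float {
    $n = count($a);
    if ($n === 0 || $n !== count($b)) {
        return null; // empty input or mismatched sizes
    }
    return sketch_sum_product_dev($a, $b) / $n;
}

// Sample variance: divides by n-1, so it needs at least two values.
function sketch_variance(array $a): ?float {
    $n = count($a);
    if ($n < 2) {
        return null;
    }
    return sketch_sum_product_dev($a, $a) / ($n - 1);
}

$x = [1, 2, 3, 4];
$y = [2, 4, 6, 8];

printf("%.3f\n", sketch_covariance($x, $y)); // 10 / 4 = 2.500
printf("%.3f\n", sketch_variance($x));       // 5 / 3  = 1.667
```

The only difference between the two is the divisor, so on small arrays the results can diverge noticeably.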
As you can tell, it’s pretty bare bones right now, but as I make updates, I will release them. You can view the files after the jump.
stat_fin.php:
<?php
/*
 * Copyright 2011 Tim Trampedach
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 * PHP StatFin 0.1
 *
 * stat_fin.php
 */

require_once "stat_core.php";

/*
 * fin_beta_abs_asc: Calculates the beta of equity $arr1 based on market
 * $arr2. Both arrays expect absolute values of equity prices, not relative
 * returns, in ascending chronological order.
 */
function fin_beta_abs_asc ($arr1, $arr2) {
    if (sizeof($arr1) != sizeof($arr2)) {
        return null;
    }

    $ret1 = array();
    $ret2 = array();
    $num = count($arr1);
    for ($i = 1; $i
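Since the listing above takes prices rather than returns, here is a rough standalone sketch of how a beta could be computed from two price series. This is not the code from the package. The `sketch_*` names are hypothetical, and I'm assuming simple period-over-period returns, positive prices, and the textbook definition of beta as the covariance of equity and index returns divided by the variance of the index returns, with the same divisor in both so that it cancels out.

```php
<?php
// Hypothetical names; a sketch of the textbook definition, not PHP StatFin's code.

// Turn ascending prices into simple period-over-period returns.
// Assumes every price is positive.
function sketch_simple_returns(array $prices): array {
    $returns = [];
    for ($i = 1; $i < count($prices); $i++) {
        $returns[] = $prices[$i] / $prices[$i - 1] - 1;
    }
    return $returns;
}

// Beta = cov(equity returns, index returns) / var(index returns).
// Numerator and denominator share the same divisor, so it is left out.
function sketch_beta_from_returns(array $equity, array $index): ?float {
    $n = count($equity);
    if ($n < 2 || $n !== count($index)) {
        return null;
    }
    $meanE = array_sum($equity) / $n;
    $meanI = array_sum($index) / $n;
    $cross = 0.0;
    $square = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $cross  += ($equity[$i] - $meanE) * ($index[$i] - $meanI);
        $square += ($index[$i] - $meanI) ** 2;
    }
    if ($square == 0.0) {
        return null; // a flat index has no variance to divide by
    }
    return $cross / $square;
}

// Convenience wrapper for price series in ascending chronological order.
function sketch_beta_from_prices(array $equityPrices, array $indexPrices): ?float {
    if (count($equityPrices) !== count($indexPrices)) {
        return null;
    }
    return sketch_beta_from_returns(
        sketch_simple_returns($equityPrices),
        sketch_simple_returns($indexPrices)
    );
}

// An equity whose returns are always twice the index's should have a beta of about 2.
$indexReturns  = [0.01, -0.02, 0.03];
$equityReturns = [0.02, -0.04, 0.06];
var_dump(sketch_beta_from_returns($equityReturns, $indexReturns));
```

A quick sanity check is to pass the same price series as both equity and index, which should give a beta of about 1.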
stat_core.php:
<?php
/*
 * Copyright 2011 Tim Trampedach
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 * PHP StatFin 0.1
 *
 * stat_core.php
 */

/*
 * stat_covariance: Calculates the covariance of a known sample and thus
 * divides by n, not n-1 as would be appropriate for an estimation of a
 * random sample. Identical to Excel calculation. Returns null if array
 * sizes differ.
 */
function stat_covariance ($arr1, $arr2) {
    if (sizeof($arr1) != sizeof($arr2)) {
        return null;
    }

    $k = stat_sum_product_mean_deviation($arr1, $arr2);
    return $k / (sizeof($arr1));
}

/*
 * stat_variance: Calculates the variance of a sample, dividing by n-1 as
 * done in Excel. Returns zero if the array has fewer than two elements.
 */
function stat_variance($arr) {
    $sum = 0;
    $num = count($arr);
    // Guard: with fewer than two values the divisions below would be by zero.
    if ($num < 2) {
        return 0;
    }
    $avg = array_sum($arr) / $num;
    for ($i = 0; $i < $num; $i++) {
        $sum = $sum + stat_square_mean_deviation($arr[$i], $avg);
    }
    return ($sum / ($num - 1));
}

function stat_sum_product_mean_deviation($arr1, $arr2) {
    $sum = 0;
    $num = count($arr1);
    $avg1 = array_sum($arr1)/count($arr1);
    $avg2 = array_sum($arr2)/count($arr2);
    for($i=0; $i