Posts

Adventures with R - Cricket Analysis (Predicting performance of Players based on previous ODI performances)

This post continues the series on R and cricket analysis.

Previous Series

Basic Setup

Getting all players in a single file

Clustering players based on ODI performance

This blog post tries to predict the runs an individual player will score, based on that player's performance in previous matches.

** This code took longer to set up. The initial work was testing out different regression techniques, but I kept coming back to basic logistic regression because model performance did not improve much between runs. I believe the data is thin at the individual player level, so most regression techniques struggle to predict runs scored. The model itself is not that great: it predicts the runs a player will score from the balls faced, the Venue Type (Home or Away) and the Opposition Type (Strong or Weak).
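To make the shape of such a model concrete, here is a minimal sketch and not the post's actual code. It fits a plain linear model with `lm` (swapped in purely for illustration) on made-up innings data. The column names `BF`, `VenueType` and `OppositionType`, the simulated values and the 50/10 train/test split are all assumptions.

```r
# Hypothetical per-innings data for one player; every value below is made up.
set.seed(42)
n <- 60
innings <- data.frame(
  BF = sample(5:120, n, replace = TRUE),
  VenueType = factor(sample(c("Home", "Away"), n, replace = TRUE)),
  OppositionType = factor(sample(c("Strong", "Weak"), n, replace = TRUE))
)
# Simulated runs: roughly 0.8 runs per ball plus noise, floored at zero.
innings$Runs <- pmax(0, round(0.8 * innings$BF + rnorm(n, sd = 8)))

# Hold out the last 10 innings to check the fit on unseen matches.
train <- innings[1:50, ]
test <- innings[51:60, ]

# Linear model (an illustrative choice, not necessarily the post's model).
fit <- lm(Runs ~ BF + VenueType + OppositionType, data = train)

predicted <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$Runs - predicted)^2))
print(coef(fit))
print(rmse)
```

With only a few dozen innings per player, the hold-out error will move around a lot from split to split, which fits the point above about thin data.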

Files are available at Code

Full code is reproduced for reference

library(lubridate)
libra…

Adventures with R - Cricket Analysis (Clustering players based on ODI data)

Continuing on the Cricket Analysis Series

I wanted to take a deep dive into Clustering. I have the ODI database and I thought it would be instructive to put the data to use

The main guide I used can be found HERE
This is an excellent guide to cluster analysis in R and I highly recommend it

The main code can be found HERE
The final output file can be found HERE

The final Tableau public dashboard can be found HERE


The code walkthrough is as follows

## Different packages that you need
library(mclust)
library(tidyverse)
library(cluster)
library(factoextra)
library(data.table)
library(reshape2)
library(sqldf)

setwd('C:/Training/R/CricketAnalysis/')

myData <- read.csv("ODIData.csv")
## 100.0 forces real-number division so the strike rate (SR) is not truncated to an integer
myData <- sqldf("select Player, sum(Runs) Runs, sum(Mins) Mins, sum(BF) BF, sum(Fours) Fours, sum(Sixes) Sixes, 
                  ((100.0 * sum(Runs))/sum(BF)) SR from myData
                group by Player")

## Removing players that score less than the median runs
myData <- sqldf("Select * f…
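The excerpt is cut off here, so the following is only a hedged sketch of the idea in base R, not the post's code. It drops players below the median run total, scales the remaining columns and groups them. The player values are invented, and k-means with two centers is used only as a familiar example; the method actually used is in the linked code.

```r
# Hypothetical per-player totals standing in for the aggregated ODI data.
players <- data.frame(
  Player = c("A", "B", "C", "D", "E", "F", "G", "H"),
  Runs = c(120, 4500, 300, 8000, 2500, 60, 9500, 5200),
  BF = c(200, 5600, 450, 9000, 3400, 110, 10800, 5500)
)
players$SR <- 100 * players$Runs / players$BF

# Keep only players at or above the median run total.
kept <- players[players$Runs >= median(players$Runs), ]

# Scale the numeric columns so no single column dominates the distances.
features <- scale(kept[, c("Runs", "BF", "SR")])

# k-means with 2 centers is an illustrative assumption.
set.seed(1)
groups <- kmeans(features, centers = 2, nstart = 10)
kept$Cluster <- groups$cluster
print(kept)
```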

Adventures with R - Cricket Analysis (creating a Player Database)

One of the things I had encountered in the main series was the inability to create a comprehensive player database. That part of the code was fairly manual.

I kept noodling and tinkering around to get an approach for creating a comprehensive player database. This would replace the manual approach of extracting one player link at a time.

I had initially set out to use the excellent rvest package but ran into some issues trying to decipher the XPath that is required to make the link work. I believe that the player information is coded directly as HTML tags on Cricinfo, and it would have taken me a couple of XPath loops to get the player name and then the player profile id out. Definitely doable (but I will keep rvest for another piece of code I have in mind).

I focused on using dplyr, tidyr to do my heavy lifting
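As a rough illustration of the name-and-id extraction described above, here is a dependency-free sketch in base R. The link strings and the `/player/<id>.html` pattern are hypothetical stand-ins, not the real page markup, and the linked code relies on dplyr and tidyr rather than these calls.

```r
# Hypothetical link strings; the real page markup and URL pattern may differ.
links <- c(
  '<a href="/player/12345.html">Player One</a>',
  '<a href="/player/678.html">Player Two</a>'
)

# Pull the numeric id out of the href and the display name out of the tag body.
player_db <- data.frame(
  PlayerName = sub(".*>([^<]+)</a>.*", "\\1", links),
  PlayerId = as.integer(sub(".*/player/([0-9]+)\\.html.*", "\\1", links)),
  stringsAsFactors = FALSE
)
print(player_db)
```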

The code can be found HERE

Code walkthrough: I am reproducing only the first part of the code here. The main code is available for everyone to look at.

library(stringr)
library(sqldf)
library(dpl…

Adventures with R - Cricket Analysis

On my journey for more interesting R packages, I stumbled onto cricketr.
cricketr is a wonderful package found on the CRAN library HERE and written/maintained by tvganesh (GitHub link).

Big hat tip to him for taking the Cricinfo Statsguru website and converting the query/output to an R package

As usual, RStudio is the programming interface that I used. For ease of use, I go to Tools --> Global Options --> Appearance. I used Sky as my RStudio theme, Lucida Console at font size 11, and Idle Fingers as my Editor theme. It gives a neutral black background, and the fonts (comments, code, etc.) are much clearer to see

The files for the code are as follows:
Main Code
Player File
Venue File
Result File

A key objective I had in this exercise:
Avoid manual code runs or brute force. The package was easy enough to run one player at a time, but I wanted to run it in an automated fashion for multiple players
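One way to picture that kind of automation is the sketch below. It is an assumption about the general pattern rather than the main code itself. `fetch_all` and `fake_fetch` are hypothetical helpers; in practice the single-player download call from the package would take the place of `fake_fetch`.

```r
# Run one fetch per player and collect the results, skipping failures.
# fetch_one is a placeholder for whatever single-player call is in use.
fetch_all <- function(player_ids, fetch_one) {
  results <- lapply(player_ids, function(id) {
    tryCatch(
      fetch_one(id),
      error = function(e) {
        message("Skipping player ", id, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- as.character(player_ids)
  # Drop players whose fetch failed.
  Filter(Negate(is.null), results)
}

# Stand-in fetcher so the sketch runs offline; it fails for one made-up id.
fake_fetch <- function(id) {
  if (id == 999) stop("no data")
  data.frame(PlayerId = id, Runs = id %% 100)
}

all_players <- fetch_all(c(101, 999, 202), fake_fetch)
combined <- do.call(rbind, all_players)
print(combined)
```

Wrapping each call in `tryCatch` means one bad player id does not stop the whole batch.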

The R package needs player id to run but Cricinfo provides the PlayerName. I was not able to …

Adventures with R - Facebook Ads (Part 3)

Now that we have set up the required packages and authentications in Facebook, it is time to go to work in R !!

RStudio is the programming interface that I used. For ease of use, I go to Tools --> Global Options --> Appearance. I used Sky as my RStudio theme, Lucida Console at font size 11, and Idle Fingers as my Editor theme. It gives a neutral black background, and the fonts (comments, code, etc.) are much clearer to see


First part of the code - working directory and packages (all R code is in blue so that it is easy to differentiate in the blog)

getwd()

setwd("<Use your own folder directory structure here>")

#install.packages("httr")
library(httr)
library(dplyr)

library(devtools)

#devtools::install_github('Rdatatable/data.table')
library(data.table)

#devtools::install_github("cardcorp/fbRads")
library(fbRads)
library(rlist)
library(RODBC)

library(stringr)

Next is a two-step process.

Step 1 - A one-time step to get the authentication
# Getting the aut…