Rahul Sangole
http://rsangole.netlify.com/
Recent content on Rahul Sangole
Tue, 27 Feb 2018 00:00:00 +0000
453 Text Analytics - Class Project - Aviation Safety Data
http://rsangole.netlify.com/project/453-text-analytics-class-project-aviation-safety-data/
Tue, 27 Feb 2018 00:00:00 +0000
Contents: Phase 1 Report; What’s the Objective?; Quick look at the data; View the data; Observations; What does the raw text tell us?; Visualise the words; Topic Modeling using LDA; Visualising the results; Next Steps
Phase 1 Report
    knitr::opts_chunk$set(cache=TRUE)
    library(knitr)
    library(wordcloud2)
    library(kableExtra)
    library(topicmodels)
    library(tm)
    library(tidyverse)
    library(magrittr)
    library(tidytext)
    library(ggplot2)
    library(dplyr)
    library(tidyr)
    library(gridExtra)
    load('453_safety_project/training_data.RData')
    load('453_safety_project/training_labels.RData')
What’s the Objective?
This is the dataset used for the SIAM 2007 Text Mining competition.
Books I Reference
http://rsangole.netlify.com/project/books-i-reference/
Tue, 13 Feb 2018 00:00:00 +0000
The full list of the books on my shelf is on my Goodreads account. The ones I refer to the most are listed here:
Deep Learning
    Deep Learning with R, Francois Chollet
    Handbook Of Neural Computing Applications, Alianna J Maren
    Deep Learning, Ian Goodfellow
    LSTM with Python, Jason Brownlee
GLM
    Generalized Additive Models: An Introduction with R, Second Edition, Simon Wood
    Applied Regression Modeling, Iain Pardoe
    Generalized Linear Models, John P.
First foray into Shiny
http://rsangole.netlify.com/post/first-foray-into-shiny/
Sat, 27 Jan 2018 00:00:00 +0000
Contents: Visualising Distributions; Visualising Linear Discriminant Analysis
Shiny had interested me for a while for its power to quickly communicate and visualise data and models. I hadn’t delved into it for lack of time, until now.
Here are two quick visualizations I created as my first foray into R Shiny. Nothing earth-shattering, but they helped me learn the tool.
Visualising Distributions
Hosted on shinyapps for free, at link. GitHub code here.
Performance Benchmarking for Dummy Variable Creation
http://rsangole.netlify.com/post/dummy-variables-one-hot-encoding/
Wed, 27 Sep 2017 00:00:00 +0000
Contents: Motivation; Why do we need dummy variables?; Ways to create dummy variables in R; stats package; dummies package; dummy package; caret package; Performance comparison; Smaller datasets; Large datasets; Conclusion; Qs
Motivation
Very recently, at work, we got into a discussion about creating dummy variables in R. We were dealing with a fairly large dataset of roughly 500,000 observations and 120 predictor variables. Almost all of the predictors were categorical, many with a fairly large number of factor levels (think 20-100).
Pur(r)ify Your Carets
http://rsangole.netlify.com/post/pur-r-ify-your-carets/
Sun, 17 Sep 2017 00:00:00 +0000
Contents: The motivation; An example using BostonHousing data; Load libs & data; Create a starter dataframe; Select the models; Create data-model combinations; Solve the models; Extract results; In conclusion
tl;dr: You’ll learn how to use purrr, caret and list-cols to quickly create hundreds of dataset + model combinations, store data & model objects neatly in one tibble, and post-process them programmatically. These tools enable succinct functional programming in which a lot gets done with just a few lines of code.
Finite Mixture Modeling using Flexmix
http://rsangole.netlify.com/post/finite-mixture-modeling-using-flexmix/
Wed, 01 Feb 2017 00:00:00 +0000
Contents: Model Based Clustering; Quick EDA; Model building; Mixtures of Regressions; Quick EDA; Model Building; Results; Further investigation; Notes; References
This page replicates the code written by Grün & Leisch (2007) in ‘FlexMix: An R package for finite mixture modelling’, University of Wollongong, Australia. My intent here was to learn the flexmix package by replicating the authors’ results.
Model Based Clustering
The model based clustering on the whiskey dataset.
Factor Analysis of Personality Traits
http://rsangole.netlify.com/project/factor-analysis-of-personality-traits/
Sat, 03 Sep 2016 00:00:00 +0000
Contents: Background; Objective; Duplication of the Survey Results; What does the factor analysis tell us?; Conclusion; How many factors to select?; Where’s the R code?
Background
In the course Predict-410: Linear Regression & Multivariate Analyses, taught by the excellent Prof Srinivasan, we learned Factor Analysis (FA). FA is a technique used to identify ‘latent’ or ‘hidden’ factors common to a larger pool of observable or measurable variables. These factors are assumed to drive the way the measurable variables behave.
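To make the idea of latent factors concrete, here is a minimal sketch in base R. It is not the code from the post: the data are simulated, and the variable names (f1, f2, v1 to v6) and the loadings are made up for illustration. Six observed variables are generated from two hidden factors, and factanal() from the stats package is then asked to recover a two-factor structure.

```r
# Hypothetical illustration: simulated data, not the survey data from the post.
set.seed(42)
n  <- 500
f1 <- rnorm(n)  # latent factor 1 (never shown to the model)
f2 <- rnorm(n)  # latent factor 2 (never shown to the model)

# Six observable variables: v1-v3 are driven by f1, v4-v6 by f2, plus noise.
observed <- data.frame(
  v1 = 0.8 * f1 + rnorm(n, sd = 0.5),
  v2 = 0.7 * f1 + rnorm(n, sd = 0.5),
  v3 = 0.9 * f1 + rnorm(n, sd = 0.5),
  v4 = 0.8 * f2 + rnorm(n, sd = 0.5),
  v5 = 0.7 * f2 + rnorm(n, sd = 0.5),
  v6 = 0.9 * f2 + rnorm(n, sd = 0.5)
)

# Maximum-likelihood factor analysis with two factors and varimax rotation.
fit <- factanal(observed, factors = 2, rotation = "varimax")

# Show the loadings, hiding small ones to make the grouping easier to read.
print(fit$loadings, cutoff = 0.3)
```

In a setup like this, one would expect v1 to v3 to load mainly on one factor and v4 to v6 on the other, which is the sense in which the factors are "common to a larger pool" of measured variables.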