Archive

Other

I am currently working on a personal project: visualization and prediction models of urban traffic in Santander (northern Spain). The city has more than 200 sensors spread all around, and you can grab their readings in “real time” (+2 hours) using the API at datos.santander.es

I am using R, simply because I want to test amazing new packages such as Flexdashboard. The data pipeline is as follows:

1.- Grab sensor data every 60 minutes using a cron job & an R script and save it in a SQLite database

2.- Calculate metrics & present data using RMarkdown & Flexdashboard

3.- Add a prediction model to the final output & publish it as an HTML website (pending)
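
As a rough idea of the storage part of step 1, here is a minimal sketch. It is written in Python rather than the R script the project uses, the table layout and the field names (`sensor_id`, `measured_at`, `intensity`) are assumptions made for illustration, and fetching from the API is left out entirely. The cron line in the comment, including the script name, is hypothetical too.

```python
import sqlite3

# Hypothetical cron entry to run the grabber at the top of every hour:
#   0 * * * * python3 grab_sensors.py


def init_db(conn):
    """Create the readings table if it does not exist yet (assumed schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS readings (
            sensor_id   TEXT NOT NULL,
            measured_at TEXT NOT NULL,
            intensity   REAL,
            PRIMARY KEY (sensor_id, measured_at)
        )
        """
    )


def save_readings(conn, readings):
    """Store a batch of readings, skipping rows that are already saved.

    Hourly runs can overlap with data fetched earlier, so the primary key
    plus INSERT OR IGNORE keeps the table free of duplicates.
    """
    rows = [(r["sensor_id"], r["measured_at"], r["intensity"]) for r in readings]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO readings (sensor_id, measured_at, intensity) "
            "VALUES (?, ?, ?)",
            rows,
        )


def count_readings(conn):
    """Return the number of stored readings."""
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


if __name__ == "__main__":
    # In-memory database for the demo; a real job would use a file path.
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    save_readings(
        conn,
        [
            {"sensor_id": "1001", "measured_at": "2016-08-13T09:00:00", "intensity": 120.0},
            {"sensor_id": "1002", "measured_at": "2016-08-13T09:00:00", "intensity": 85.0},
        ],
    )
    print(count_readings(conn))
```

Keying the table on sensor and timestamp means the job can be re-run safely if a cron execution fails or overlaps with the previous one.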

 

Below are some of the current assets (work in progress!):

[Eight screenshots, dated 2016-08-06 to 2016-08-13]

“useR” is the main R conference of the year, and I found the 2016 edition, held at Stanford University, super interesting. You can watch or download its resources here.

Watching videos from this year’s edition, I found two packages that caught my attention.

The first one is “Flexdashboard” and, as its name says, it is a tool for creating dashboards. That is nothing really special, given the myriad of dashboard tools out there. But if (like me) you dream of doing most of your work in a single programming environment, being able to build dashboards with simple R Markdown, including all the nice and increasingly interactive artifacts built by the R community, such as “htmlwidgets”, is a different thing altogether. Its ease of use can lower the barrier for entry-level users, which makes it even more appealing.

If the first package is for communicating, the second one is for data storage. SQL databases are everywhere; they are the most widely used form of storage and have been around for decades. There is a long list of SQL options out there, but I like what MonetDBLite does. MonetDBLite is fast and very easy to install, and what I really like is its ability to work with dplyr syntax.

 

[Three screenshots, dated 2016-07-17]

When I first started coding in R, it was mainly because of ggplot2. Now that I am building my foundations in Python, I have found two Python modules that are similar to ggplot2. One of them is called ggplot and, as you may have guessed, it is completely based on ggplot2 and The Grammar of Graphics, by Leland Wilkinson. The other module is called seaborn, developed at Stanford, and it might push the ggplot level a little bit further. I am very used to the ggplot/ggplot2 syntax, but I will definitely use seaborn in the future.

Have Fun!

[Image: figure_5]

 

I’ve been away for a long time. This year has been crazy: a new job, a new city, teaching, and a lot of hard work.

I’ve also changed my tech stack in recent months, moving back to Linux and focusing more on Python than on R.

So many changes have kept me from doing my usual personal projects, so I do not have much to show, but I can briefly tell you about the things that have caught my attention in the last few months:

  • A lot of scraping (edans.com, forbes.com…) with both Python & R
  • Network table generation (pair tables and node property tables, using R)
  • A hell of a lot of data visualization using the new version of our software, Quadrigram
  • Data models for our projects at Bestiario
  • Learning about interesting projects developed in Python that fall within my areas of interest, such as Cubes (OLAP)

I hope to share some of this work with you and, hopefully, to be able to post more frequently in the near future.