Learning R

Today I spent some time learning the R language.

The problem I was trying to solve was to convert local prices of some items into Euros. I had been using a fixed exchange rate for all data, but as exchange rates fluctuate so much this is incorrect.

My first though was to find a free API that I could query to get the values I wanted. The first API I found didn’t cover all the currencies, the next one I found I burnt through the free allowance in one pass.

A colleague of mine mentioned using R to solve this, he sent me some links and I set out to write my first piece of R code.

My finished code can be found on github and I will attempt to explain some of it.

R defines functions fairly simply

nameoffunction <- function(arg1, arg2)
{
arg1 * arg2
}

I have created a function that takes 2 parameters date and currency. I know I have about 10 different currencies that I want to get currencies for and I want to loop through each day so I will need to pass in a date.

The source of my exchange rate information is the www.xe.com website, its historical exchange rate page passes currency and date into the query string so I should be able to build up a string containing all the different elements.

All programming language can concatenate strings and R is no different. It uses paste()

var <- paste(“Hello”, “World”)

However R has an annoying feature in this function. I would expect that var in the above example would contain “HelloWorld”, it doesn’t it contains “Hello World”. Why it automatically adds a space I don’t know?

var <- paste(“Hello”, “World”, sep=””)

I am not entirely sure what all of the code does but I can take a good guess.

read_html() I would guess loads a html page, html_nodes() finds all the html tags of a certain type on the page, in my case <table>, html_table() reads the first table it finds.

table1[2] selects the second column, and head() selects a specific number of rows. I want the first row and second column so I combine these two as head(table1[2],1)

Now that I have found my exchange rate what do I do with it? R can read and write to SQL Server so why not store this info in a SQL lookup table. I can then use this data in a stored procedure when I process my data.

To query sql you can use sqlQuery(), it has two parameters, a sql connection and a TSQL command (eg a SELECT, INSERT, UPDATE statement)

I use a while loop to loop through every day between 1st October 2016 and today and look up the exchange rate for each currency.

For now I am manually running this R script, however there are ways to run R directly from SQL Server which I may well investigate. I could then have a SQL job to run this on a schedule, maybe once a day to get the latest exchange rates. I also would like to do something a bit cleverer like only getting exchange rates for the days that I need them by querying existing database tables.

Getting control of a codebase

So recently I started working on a new codebase. I will be honest when I first saw it, it was a mess. Here are a few of the things I did to try and regain control.

I was given access to the source code on Visual Studio Team Services. However this consisted of a single commit 3 months ago. When I looked at what was running on the production server it was clear that changes were being made live with no regard for source control.

The first thing I did was commit everything that was running live into source control.

Next I created a SQL Server Data Tools (SSDT) project to keep track of all the database objects. Previously there was a folder with some stored procedures in it, but these did not match with what was currently running.

SQL ServerI now had in source control the current state of the website and the database, so I knew I could get things back to this state if I made some bad changes.

Lets start by looking at the website code I had. There was no solution file, the only way to look at the website was to setup my local IIS to run what was in the website folder. I could then use Visual Studio to “open” my local IIS website and attach to process to debug it.

Next I Looked at Default.aspx to see how the website worked. The majority of the website code was stored in the database stored procedures. After the tag there was a <% %> which contained a Response.Write(RunSP.RunStoredProcedure(Parameter1, Parameter2, …) command, which executed a stored procedure and the results of the stored procedure was the html code including any javascript that the webpage needed to display. I will be honest I have never seen any code like it. My guess is that the developer was secretly a DBA and wanted to make any web page changes by just changing how the stored procedures work.

This meant that the website is not going to do anything without a backup of the database running, and meant my SSDT project was going to be vital. However the database was in a bad state, it consisted of a fair few broken objects and SSDT would not build.

Using find I went through each of the broken database objects to find where in the code they were being used. Luckily most were referenced in commented out code, so I just removed all the broken database objects. The database could now be built. However there was a dependency on the users table of another database. (This was the developers solution to sharing logins between websites) As I was using SSDT I added a database dependency, problem solved for now.

Next I tried publishing my database. SQL CMD encountered a parsing error.  The reason for this was my SPs contained javascript eg $(document), SQL CMD uses $(DatabaseName) as variables for different database so it was getting itself confused.

My solution was to use Find and Replace to replace all the $ with ‘ + CHAR(36) + ‘

So I now have a SSDT project that builds and publishes but still no website project.

To get the website running from Visual Studio I started off creating a .Net 4 website project and added Entity Framework 5 and MVC 3 via nuget. I then copied all the website code into the new project, and with a bit of work I got it to build. Most of the work was relating to namespaces and referencing the correct one and moving the EF model from AppCode to a custom named folder. A bit of trial and error later I had a version of the website that could be run from Visual Studio.

I have not deployed my new version of the website as it needs further testing. No automated testing or even a smoke test checklist currently exist.

Visual Studio Team ServicesAs my source code is hosted on Visual Studio Team Services (VSTS), I can get VSTS to build each commit and check I haven’t broken the build. This is not that helpful at the moment, hopefully one day I will have automated tests that can be run here as well.

Wow, I feel like I have done loads with this code so far but there is loads more still to do. I need to understand more about the business processes behind the code with a hope to understand why some architectural decisions have been made. I want to refactor the code as much as is possible, I would like to remove much of the html/javascript from the stored procedures as I can’t see that there is any advantage to running a website like this. Please correct my if I am wrong.

Running Windows on Raspberry Pi

Last year you may remember me talking about playing with a Raspberry Pi. Well since then my Raspberry Pi has been sat on a desk collecting dust.

This week I attended Leeds Sharp and the topic was Running Windows on Raspberry Pi and this has inspired me again to do something with a Pi.

But first what did I learn.


I had heard that a cut down version of Windows 10 could be installed on the newer Raspberry Pi’s, but I hadn’t really understood how cut down the version of windows is. Having now seen it demonstrated the OS consists of a single page with a few menu options.

The real power of Windows 10 IoT is when you connect remotely to it. There are a couple of ways to do this, PowerShell (check out https://ms-iot.github.io/content/en-US/win10/tools/CommandLineUtils.htm for a few commands), and of course connecting Visual Studio to your Pi.

When I had previously played with a Pi, it had been with bash scripts and linux commands. The beauty of installing Windows IoT is that you can write c# code, something I do in my day job so theoretically I should find it easier.

hqdefaultThe demonstration at Leeds Sharp was pretty impressive. If you are a fan of the Big Bang Theory you may recall Sheldon playing a Theremin. Well it is actually possible to construct a Theremin from a couple of sensors and a Raspberry Pi. The code for which is on github.

Now that I have been inspired what shall I do?

My Raspberry Pi won’t support Windows 10 IoT, so I need to buy the latest version. I am thinking of buying a kit so I can play about with a breadboard, LEDs and resistors. Maybe not build a robot straight away but certainly try doing something that connects to the GPIO pins.

If you have any suggestions leave a comment below.

SQL Server Management Studio

I use SQL Server Management Studio all the time for writing queries, getting data and general SQL development.

SQL Server Management Studio

I have enjoyed seeing the improvements that each new version of SQL Server Management Studio (SSMS) introduced. One great improvement was intellisense.

This feature saves typing and reduces errors by automatically suggesting tables, column names or other database objects.

A common query that I get asked to write is provide a spreadsheet that gives the information that satisfies certain criteria. This is easy to do in SSMS, you can write the query, click execute and the rows that satisfy the criteria are displayed. These rows can then be easily copy/pasted into Excel or other spreadsheets.

A common data item that gets stored in databases is addresses and addresses often contain line breaks to make the data display better. In the earlier versions of SSMS, when you copied and pasted these line breaks were ignored and the data displayed the same in SSMS as it did in Excel. However in the more recent version, theses line breaks got copied across breaking your spreadsheet and making it hard to see what data corresponded with what.

Now I don’t know if this should be described as introducing a bug or fixing one. I can easily argue both sides. If your data contains a line break and you copy this data it should include the line break in the destination, but if it does that it displays badly in Excel.

The fix I have been using until recently is to use the following TSQL command in my queries.

SELECT REPLACE(REPLACE(@str, CHAR(13), ”), CHAR(10), ”)

This command replaces any line breaks with an empty string. Both Char(10) and Char(13) are needed because you can have different types of line breaks. This is great if you are writing the script from scratch but isn’t great if your are running a stored procedure or your query has a lot of columns.

The answer to this is to use Visual Studio to run your SQL query. In Visual Studio you can write and run queries via Server Explorer and the results produced don’t contain line breaks. I have only just discovered this solution, but so far it has worked and is very easy to do, plus as I do most of my development in Visual Studio anyway it saves me having to open SSMS to test my queries.

Visual Studio

I recently replaced my installation of Visual Studio 2013 with Visual Studio 2015 RC.

I like the new version, I am not a Visual Studio expert so it will probably take me a while to find all the good stuff but here are some initial thoughts.

Being as my MSDN subscription is still valid I have installed the professional version to take advantage of its extra features like CodeLens.

One of the first things I spotted was that the integration with Azure has been improved. In the last version it was difficult to sign in to Azure with the correct credentials, especially if you have more than one Azure account. Now you can add multiple Accounts and Subscriptions.

i1CodeLens is a feature that has been around in Visual Studio since 2013, but in 2015 it is available in Professional meaning more people have access to it.

CodeLens gives coders useful information at a glance. Above each class/method is listed how many references there are. If you click on the number of references you can see where that class or method is referenced in the rest of your code. Useful to be able to see which methods or classes are not being used.

Next CodeLens shows who (according to git) last changed the class or method and how many days ago that was. Clicking on it shows a cool graph of when changes have happened and who did them.

Next you can see the number of changes that have been made, basically a source control history, but without having to load up your git client.

For code that doesn’t contain classes or methods such as T_SQL you can see at the bottom of your code window the last two CodeLens information to help you track down what changes have happened recently.

The last new feature that I have noticed is the Light Bulbs that keep showing up all over my code. I think the Light Bulbs might be called Quick Actions, but whatever they are called they are suggestions on how to improve your code. So far they have suggested to be to get rid of using references that are not used, simplify a fully qualified name, drop unneeded this keywords. I am sure more will popup as I do more coding.

These improvements to Visual Studio I like, and I am sure there are many more that I haven’t noticed. I expect support for the vNext .net framework is also in there somewhere which hopefully I can play around with soon.