Create multilevel directory with Azure File SDK

In the SDK there’s a function named CreateIfNotExistsAsync() that allows you to create a new directory on the fly. I.e., if you have a directory named root and under it you want to create a directory named 20190220, you can do:

var directoryReference = rootDirReference.GetDirectoryReference("root/20190220");

await directoryReference.CreateIfNotExistsAsync();

If, however, you wanted a directory structure like /root/2019/02/20, it cannot be created in one go – Azure will return an HTTP 404. Instead you have to create each subdirectory in turn, which I don’t think is quite the expected behavior.

A solution can be found at the bottom of this post, courtesy of Sven Amann, using a recursive function that you might put in your code as a local function:

var directoryReference = rootDirReference.GetDirectoryReference("root/2019/02/20");

CreateRecursiveIfNotExists(directoryReference);

void CreateRecursiveIfNotExists(CloudFileDirectory directory)
{
    if (!directory.Exists())
    {
        // Make sure the parent level exists first, then create this one
        CreateRecursiveIfNotExists(directory.Parent);
        directory.Create();
    }
}
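If you want to stay async like the surrounding code, an equivalent variant is easy to sketch (my own adaptation – ExistsAsync and CreateAsync are the async counterparts on CloudFileDirectory):

async Task CreateRecursiveIfNotExistsAsync(CloudFileDirectory directory)
{
    if (!await directory.ExistsAsync())
    {
        // Walk up towards the share root until an existing level is found
        await CreateRecursiveIfNotExistsAsync(directory.Parent);
        await directory.CreateAsync();
    }
}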

The quirkiness of DB2 TIME values

Being an old MS Sql Server hack, I’ve recently ventured into DB2 land. Many things are pretty straightforward when coding DB2 Sql, but the intricacies of the TIME value are really hard to get my brain around.

So, TIME columns contain just that – a time of day – but I think the supporting methods are really lacking and quite awful.

For instance, when you subtract two TIME values the result is not another TIME, but instead a decimal containing the difference in HHMMSS format, i.e. CAST('06:51:56' AS TIME) - CAST('05:31:10' AS TIME) results in the decimal value 12046! Can you then use 12046 for anything? Well, not really. It can’t be cast back into a TIME. The CAST function only supports casting a character value to a time…
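As a runnable illustration of the quirk (SYSIBM.SYSDUMMY1 is DB2’s built-in one-row dummy table):

SELECT CAST('06:51:56' AS TIME) - CAST('05:31:10' AS TIME) AS QUIRKY_DIFF
FROM SYSIBM.SYSDUMMY1
-- QUIRKY_DIFF comes back as the DECIMAL 12046, i.e. 1 hour, 20 minutes, 46 seconds in HHMMSS form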

My requirement was to calculate the difference between two TIME columns (here K18UKLOKKORBEGIN and K18UKLOKKOREND) in seconds, and after much googling this is the solution I came up with:

(HOUR(K18UKLOKKOREND) * 3600 + MINUTE(K18UKLOKKOREND) * 60 + SECOND(K18UKLOKKOREND)) -
(HOUR(K18UKLOKKORBEGIN) * 3600 + MINUTE(K18UKLOKKORBEGIN) * 60 + SECOND(K18UKLOKKORBEGIN))
AS “SecondsDiff”

To calculate the same thing as a TIME value, this was about as good as it gets:

TIME('00:00:00') + ((K18UKLOKKOREND - K18UKLOKKORBEGIN)/10000) HOURS + MOD((K18UKLOKKOREND - K18UKLOKKORBEGIN)/100, 100) MINUTES + MOD((K18UKLOKKOREND - K18UKLOKKORBEGIN), 100) SECONDS
AS “TimeDiff”

Of course everybody is probably creating user defined functions for this, but why isn’t that built into what is by now supposedly a mature product?
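For what it’s worth, here is a minimal sketch of such a user defined function (my own invention – assuming DB2 for LUW syntax, so check it against your platform):

CREATE FUNCTION SECONDS_BETWEEN (T_END TIME, T_BEGIN TIME)
RETURNS INTEGER
LANGUAGE SQL
DETERMINISTIC
NO EXTERNAL ACTION
-- Same hour/minute/second arithmetic as the query above
RETURN (HOUR(T_END) * 3600 + MINUTE(T_END) * 60 + SECOND(T_END)) -
       (HOUR(T_BEGIN) * 3600 + MINUTE(T_BEGIN) * 60 + SECOND(T_BEGIN));

With that in place the query above becomes simply SECONDS_BETWEEN(K18UKLOKKOREND, K18UKLOKKORBEGIN) AS "SecondsDiff".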

VS Shell installation has failed with exit code 1638.

I want to recount an experience from the trenches that can perhaps be of use to others:

While setting up a new dev box and trying to install Sql Server 2017 (and even 2016), a problem with the message “VS Shell installation has failed with exit code 1638.” haunted me for several hours.

Apparently it occurred because I had installed Visual Studio 2017 first, which in turn installs Microsoft Visual C++ 2017.

So I cleaned out both the Visual Studio installation and the Sql Server installation. The latter had disintegrated in mid-flight and left a lot of debris that Revo Uninstaller helped clear away. Many items had to be uninstalled separately, in particular Microsoft Visual C++ 2017.

After this thorough cleaning up, my Sql Server installation finally managed to succeed…

It really seems odd that the Sql Server installation should not play nice with C++ 2017, but “so it goes” as Vonnegut famously said.

Lenovo Z500 Windows 10 Backlight Problem

Seems that lots of people have had problems with not being able to adjust the screen backlight…

The solution I (or rather my daughter 🙂) found was – while booting your PC – to hold down the Fn-key and then adjust the lighting with the adjustment keys (F11 & F12).

Importing SAS data into PowerBI

Data analysis without visualization is almost useless, and to many (end-)users Microsoft Power BI is just the easiest to use, freely available tool around. While the built-in PowerQuery data access feature provides access to a lot of different types of data sources, SAS data sets are, however, not one of them.

But while there is no built-in native adapter for SAS data sets in Power BI (which a good many people seem to be asking for), the “R script” adapter can come in quite handy as a workaround!

For R there are a couple of packages which will read SAS data: “sas7bdat” and “haven”. It seems that “sas7bdat” is quite slow and, as “haven” is the choice in RStudio for importing SAS data, I took that as a seal of approval and went with “haven”.

When “R script” is selected as the data source in Power BI, a script window pops up.


Here are the few lines of script for my sample:

if (!require("haven")) {
   install.packages("haven")
}

require(haven)
data_file <- "c:/Users/peh/Documents/SAS/aregn03.sas7bdat"
sasdata <- read_sas(data_file, catalog_file = NULL, encoding = NULL)

Simply edit the SAS file name (“data_file” – remember to fix the slashes in the path name to R standard) in the script to suit your own file path and press OK. The Power BI navigator will then ask which of the R dataframes in your script to import. Here there is only one (“sasdata”), so select that, click OK, and your data will be loaded into the Power BI data model.

The usual memory limitation of R will of course apply to the SAS data being loaded, but in most cases I think this method will be quite sufficient.

Of course you could export your data from SAS itself to a variety of Power BI compatible formats, or use R to convert it to e.g. CSV before loading it into Power BI, but I think the above method is quite convenient if you’re a SAS shop, and an easy template to communicate even to people without prior knowledge of either SAS or R.
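If you do go the CSV route, that conversion is only a couple of lines of R as well (a small sketch reusing the sample path from above):

library(haven)

# Read the SAS data set and dump it as CSV for Power BI to pick up
sasdata <- read_sas("c:/Users/peh/Documents/SAS/aregn03.sas7bdat")
write.csv(sasdata, "c:/Users/peh/Documents/SAS/aregn03.csv", row.names = FALSE)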

Doing a case sensitive REPLACE in T-Sql

Oh, the wonder of Scandinavian letters (and German, and French, and Dutch)… Well, I’ve just had to do a case-sensitive text replacement on my Sql box, which uses the otherwise case-insensitive collation “Danish_Norwegian_CI_AS”.

Fortunately the REPLACE function can take a collation as part of its string expression so my letter substitution ended up like so:

	UPDATE [OEDBAdmin].[dbo].[KontrolRapporter] SET Navn = 
		REPLACE(
			REPLACE(
				REPLACE(
					REPLACE(
						REPLACE(
							REPLACE(
								REPLACE(
									Navn COLLATE Danish_Norwegian_CS_AS,'øe','oe'),
									'æ' COLLATE Danish_Norwegian_CS_AS,'ae'),
									'ø' COLLATE Danish_Norwegian_CS_AS,'oe'),
									'å' COLLATE Danish_Norwegian_CS_AS,'aa'),
									'Æ' COLLATE Danish_Norwegian_CS_AS,'Ae'),
									'Ø' COLLATE Danish_Norwegian_CS_AS,'Oe'),
									'Å' COLLATE Danish_Norwegian_CS_AS,'Aa')

Importing a file into a Sql Server varbinary column

Previously I’ve mostly achieved this by using Powershell, but when the file you want to import resides on the Sql server itself (or somewhere accessible to it) it can actually be done with just a few T-SQL statements.

However I need to dynamically name the file to be imported, and that takes an extra hoop to jump through because OPENROWSET doesn’t take a variable as a parameter for the file name. There are a good deal of suggestions on the web on how to do this, but I favour simplicity so my solution really tries to cut it down to the bare necessities.

First I created a “Documents” table for storing my files:

CREATE TABLE [dbo].[Documents](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Date]  AS (getdate()),
    [FileName] [varchar](512) NOT NULL,
    [DocumentData] [varbinary](max) NOT NULL,
CONSTRAINT [PK_Documents] PRIMARY KEY CLUSTERED
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Then the import itself:

DECLARE @FileName VARCHAR(200) = 'e:\Wheat_vehicles_combine_harvester_farming_john_deere_1920x1200.jpg';

DECLARE @sql varchar(max) = 'SELECT ''' + @FileName + ''',* FROM OPENROWSET(BULK N''' + @FileName + ''', SINGLE_BLOB) AS RS';

INSERT INTO PERH..Documents (FileName, DocumentData)
    EXEC(@sql);
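A quick sanity check after the insert could be (DATALENGTH showing the size of the stored blob):

SELECT Id, [Date], [FileName], DATALENGTH(DocumentData) AS SizeInBytes
FROM PERH..Documents;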

That is really all there is to it – happy coding 🙂

PS. Checking for file existence, deleting the source file after loading etc. can be achieved via the extended stored procedures on your system, such as xp_fileexist and xp_delete_file.
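These procedures are undocumented, but as a sketch, a typical existence check looks something like this (using the sample file from above):

DECLARE @Exists INT;
EXEC master.dbo.xp_fileexist 'e:\Wheat_vehicles_combine_harvester_farming_john_deere_1920x1200.jpg', @Exists OUTPUT;
IF @Exists = 1
    PRINT 'File found - ready to import.';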

Create PDF from Sql with Powershell

Saving the results of a query to PDF would be a neat feature to have in SSMS – but second best is to have a small Powershell script to assist you.

I came across this article, which contains a Powershell module by Dr. Tobias Weltner that nicely wraps PdfSharp-WPF.dll, and then it was just a matter of processing my own Sql (in this case a list of tables/columns) like so:

Import-Module E:\PowerShell\PDF\PDFTools

$connection = New-Object System.Data.SqlClient.SqlConnection("Data Source=LC7207SQL;Database=Watson; Integrated Security=true;")
$query =
    "
    SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
    FROM INFORMATION_SCHEMA.COLUMNS
    ORDER BY TABLE_NAME, ORDINAL_POSITION
    "
$adapter = New-Object System.Data.SqlClient.SqlDataAdapter ($query, $connection)
$table = New-Object System.Data.DataTable
$adapter.Fill($table) | out-null

@($table) | Out-PTSPDF -Path $env:temp\report.pdf -Open -A -GroupBy TABLE_NAME

There are a number of parameters to play with on the Out-PTSPDF cmdlet, and if required you can also pipe your data through a formatter, like Format-List, before PDF’ing it.
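For instance, a hypothetical variation on the call above:

@($table) | Format-List | Out-PTSPDF -Path $env:temp\report-list.pdf -Open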

RegEx in R to extract values between square brackets

Except for RegEx always being a pain, one would think that parsing a string in R is a piece of cake. Well, it is – sort of…

So here goes:

# Note: stringr isn't strictly needed here - regmatches/gregexpr below are base R
install.packages("stringr")
library(stringr)

formulas <- c(
    "[000275]-[007825]-[007826]-[007827]",
    "[bru]-[007303]-[007301]-[007305]-[007307]"
    )

# The lazy quantifier .+? makes each bracketed token a separate match
variables <- regmatches(formulas, gregexpr("\\[.+?\\]", formulas))
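Note that the matches still include the brackets themselves; if you want only the values in between, one way (a small base R sketch) is to strip them afterwards:

# Remove the [ and ] characters from each extracted token
values <- lapply(variables, function(v) gsub("\\[|\\]", "", v))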

Getting R to run on Sql Server 2016

I have slowly been moving into the R space for a couple of years, and see it both as an exciting DSL and as a journey from Sql and C# experience into the realm of serious data analysis and data science (and visualization).

Microsoft is making a giant contribution by making the results of its Revolution Analytics acquisition more or less freely available to the world. I was able to attend the EARL conference on R applications in London some weeks back, and it was an exciting view into the world of analytics!

Today I got around to firing up some R on Sql Server 2016, with an EARL session by Adam Rich as my starting point. It was somewhat laborious, and the documentation and samples around the web were really convoluted or sometimes plain misleading. I really had to cut through a lot of extraneous information, so for future reference, here is what I did.

I had already installed Microsoft R and RStudio (and VS R tools, though I didn’t use them for this) as well as Sql Server 2016 with R, but at first I just wanted to check out the server with a very simple script:

EXEC sp_execute_external_script @language = N'R',
    @input_data_1 = N'SELECT 42 AS TheAnswer',
    @script = N'OutputDataSet <- InputDataSet'
    WITH RESULT SETS (([TheAnswer] int not null));

This failed with the message:

'sp_execute_external_script' is disabled on this instance of SQL Server. Use sp_configure 'external scripts enabled' to enable it.

The answer to this was to run these commands and then to restart the database service:

sp_configure 'external scripts enabled', 1;
RECONFIGURE;

Now it was time to try my hand at an R script that I would write locally on my PC and run remotely on the Sql Server with a trusted connection – in R parlance, in a remote compute context – pulling a bit of data. First, however, the Microsoft R Client had to be installed – the version I had apparently didn’t include the MRO stuff required to connect to the server. You can check under “c:\Program Files\Microsoft” to verify that you have both an “MRO” and an “R Client” directory.

A major problem was this message when setting up the server connection (RxInSqlServer):

Error in .rxEnsureOdbcCredentials(connectionString, server, databaseName,  :
  Trusted_Connection or User must be specified either in the connection string with the uid keyword or as a named parameter.

Most samples included a connection string for the RxInSqlServer function, but this seems to be unnecessary. I had to fiddle quite a bit with the parameters for the various special R server functions, i.e. those prefixed with Rx. Also note that sometimes the prefix is Rx and sometimes rx… It appears that two connections are necessary: one for the R server compute context and one when executing Sql in the R context. But, to cut a long story short, this is what I got to work as a sort of minimum solution:

# Set up a trusted connection to R in Sql 2016

sqlCompute <- RxInSqlServer(
    server = "LC_SQL",
    #wait = TRUE,
    #consoleOutput = FALSE,
    #shareDir = "e:/",

    user = "Trusted_Connection"
    )

rxSetComputeContext(sqlCompute) # Switch script execution to Sql server

sqlConnString <- "Driver=Sql Server; Server=localhost; Database=master; Trusted_Connection = True"

sqldata <- RxSqlServerData(sqlQuery = "select name, database_id from sys.databases", connectionString = sqlConnString)
rxGetVarInfo(data = sqldata) # Show our variables
myframe <- rxImport(inData = sqldata) # Get data as a dataframe
print(myframe)
rxSetComputeContext(RxLocalSeq()) # Switch script execution back to local machine

Hope this can help someone struggling with the same challenges 🙂