The odyssey to Azure SQL has been difficult and I fear it is not over, hence the preemptive "Part 1". The goal is and has been for some time to move to the cloud. For me, an enterprise developer, ironically this is not about scaling. It is about supporting legacy systems but keeping the technology up to date when possible. The bread and butter of Enterprise development (for this Borg) has always had an MS SQL Server at the center of it wrapped by a constellation of .NET stuff.
To get this series of posts started, lets outline the task of moving databases off on prem MS SQL Server and onto Azure SQL. Azure SQL DB feels right in that it is quick to setup compared to an Azure SQL Managed Instance, minutes compared to hours. Also, the networking is simpler. And the price is right.
First thing I noticed was we have no Linked Server feature in either of these options. How am I going to connect to Oracle databases? The simplest option I could find was to use the EXTERNAL DATA SOURCE feature which can read from either an SQL Server or Azure Blob Storage. Then you can use BULK INSERT to load directly into a table or use OPENROWSET to SELECT the data.
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/import-bulk-data-by-using-bulk-insert-or-openrowset-bulk-sql-server?view=sql-server-ver16
First thing to consider is we want no public access on the Azure SQL DB or the Azure Blob Storage. We want only private access defined by a private endpoint. I tried many different scenarios that failed but the one that worked was turning on the Managed Identity for the Azure SQL DB and assigning the IAM role Storage Blob Data Reader to the Azure SQL DB Managed Identity. It seems that the only way to get the Azure SQL DB to use the private endpoint was to use Managed Identity. Shared Access Signature access would not work with public network access disabled on the Azure Storage. On the Azure SQL DB run this SQL to create a Managed Identity credential with the name ManagedIdentityCredential:
CREATE DATABASE SCOPED CREDENTIAL ManagedIdentityCredential
WITH IDENTITY = 'Managed Identity'
Then create the external data source:
CREATE EXTERNAL DATA SOURCE MyBlobStorage
WITH
( LOCATION = 'https://mystorage.blob.core.windows.net/mycontainer' ,
CREDENTIAL = ManagedIdentityCredential ,
TYPE = BLOB_STORAGE
) ;
This is the OPENROWSET query:
SELECT *
FROM OPENROWSET(
BULK 'datafolder/datafile.csv',
DATA_SOURCE = 'MyBlobStorage',
FORMAT = 'CSV',
FIRSTROW = 2,
FORMATFILE = 'dataformats/dataformat.xml',
FORMATFILE_DATA_SOURCE = 'MyBlobStorage'
) AS DataFile;