Microsoft Data Platform Stack Overview
Kay Sauter - Apr 12 '20 - - Dev Community
Table Of Contents
Motivation
I vividly remember my very first time years ago when I had to explain a relatively simple business intelligence architecture to some people who never have heard of such architecture before. I've left them confused. Well, I can't blame them for I was talking with too many acronyms like SSAS, SSIS, SSMS, SSRS! During my career, I've learned a lot on how to explain this, but that is still no easy task if you're talking to people who are not familiar with the data world. And because Microsofts data platform family is still growing, this problem doesn't get any easier.
Unfortunately, even though I am very happy with Microsoft's documentation about their data stack, I am yet to find a good overview of all Data Platform Tools which you do not have to click through tons of links. Such one single site would help to get just an overview of the tools and tell concepts to people who are new to this topic.
This is what I aim to do with this post, a simple overview of all tools by Microsoft that somehow are related to the Data Platform Stack. I will focus on software by Microsoft only. Still, because this already contains so many tools, I will probably miss some. If you spot one or you believe that something is wrong, please drop a line in the comments! I will add or correct it and credit you! I also will try to maintain the links and add new tools by Microsoft. This page is also available on Github, so you may create a PR there as well.
On-premises software
Tool Name |
Explanation |
Costs |
SQL Server |
A the relational database solution software by Microsoft which is usually installed on-premises. It also can be installed on Virtual Machines in the cloud (Azure). |
Licensed per CPU, depending on edition and setup |
SQL Server Analysis Services |
SSAS is an analytical data engine that supports both multidimensional and tabular models. In short, it sort of aggregates and calculates data for analytical purposes. Usually, it consumes data from a data warehouse. |
Licensed per CPU, depending on edition and setup |
SQL Server Integration Services |
SSIS is the SQL Server feature that ingests and transforms data that are to be loaded into a database, oftentimes to a staging database or a data warehouse. |
Licensed per CPU, depending on edition and setup |
SQL Server Reporting Services |
SSRS is an on-premises service which creates reports and dashboard which may be scheduled and subscribed by users. The focus of this solution is paginated reports. |
Licensed per CPU, depending on edition and setup |
Excel |
Excel is a very powerful spreadsheet software that can present analyze and present data and is worldwide the most widely used software for this purpose. It can be connected to a SQL Server or SSAS or numerous other data sources. |
Part of the MS Office package, different editions available |
Power BI Premium |
A business analytics solution for dashboarding and other reports. The focus of Power BI is non-paginated reports. Its idea is to collaborate and share reports to analysts and other stakeholders with ease. It also can be used on mobile devices, hence having the focus on non-paginated reports and dashboards. It also contains SSRS. |
Subscription based |
Machine Learning Services |
A SQL Server Feature that provides abilities to use the programming languages R or Python for Machine Learning that can be executed directly on SQL Server. |
Licensed per CPU, depending on edition and setup |
Master Data Services |
Manages the Master Data sets which also has something like version control of the master data. It makes it easier to ensure that the master data is as clean as possible. |
Licensed per CPU, depending on edition and setup |
Cloud-based: Azure tools
Tool Name |
Explanation |
Costs |
Azure SQL |
Microsofts SQL Servers pendant in the cloud. Its strengths rely on Azures' versatility and the possibility for elasticity and scaling-up or scaling out and are fully managed by Microsoft. |
Depends on various tier plans which depend on CPU, memory, and location. |
Azure Data Factory |
The Azure data integration solution (ETL & ELT) while being managed by Microsoft. It can also use SSIS packages. |
Depends on several factors like service type, tiers, etc. |
Azure Cosmo DB |
Fully managed by Microsoft, is globally distributed for high availability and high speed and can use SQL and NoSQL alike. It also has elastic and automatic scaling. |
Depends on various tier plans which depend on CPU, memory, and location. |
Azure Analysis Services |
A fully managed analysis service for tabular models on Azure. It is fully managed (Platform-as-a-service) by Microsoft. It does not support multidimensional models, neither PowerPivot for SharePoint. |
Depends on various tier plans which depend on CPU, memory, and location. |
Azure Machine Learning |
Microsofts ML Service on Azure, which features a neat GUI which enables hardcore data scientists and mere mortal business analysts alike. The latter group may use pre-defined packages that were written by experts so that its usage is fairly easy. |
Depends on various tier plans which depend on CPU, memory, and location. |
Azure Synapse Analytics |
Specialised analytics solution for data warehouses and big data for extreme speed. It can work with non-relational data and "traditional SQL" alike; it also provides possibilities for monitoring and data visualization with Power BI. |
Depends on various tier plans which depend on CPU, memory, and location. |
Tabular overview from ETL to frontend
Development tools
Tool Name |
Explanation |
Costs |
SQL Server Management Studio |
SSMS is a Software to manage and query your SQL Server instances and databases. |
Free |
Azure Data Studio |
ADS enables to manage and query SQL Server instances and databases with a focus on Azure. It is not as powerful as SSMS but has other strengths than SSMS, i.e. it can use Jupyter Notebooks. |
Open Source & free |
Visual Studio |
An Integrated Development Environment which is used for developing SSAS and SSIS solutions, but also can develop databases although the latter is not so commonly used that way. Usually, SQL Server developers would prefer SSMS. |
Community Edition free, other editions not |
SQL Server Data Tools |
SSDT which is a toolset living in Visual Studio. It enables you to create workflows for SSIS and create cubes (multidimensional and tabular for SSAS) and does not require a higher edition than community edition. SSDT exists as standalone software (shell) or within Visual Studio as an add-on. |
Free |
Visual Studio Code |
A lightweight editor that allows installing an addon-for SQL Server for. It enables you to write a T-SQL Code similar to Azure Data Studio. |
Open Source & free |
Extended Events |
Enables to monitor (tracing) of events on SQL Servers, etc. for troubleshooting or solving performance bottlenecks. |
Free (a feature of SQL Server, no extra costs for extended events) |
SQL Server Report Builder |
This is a report builder for SSRS. It looks pretty much like Visual Studio minus the complexity Visual Studio gives because it is a complete and powerful IDE (Integrated Development Environment). |
Free |
Power BI Desktop |
Software for creating dashboards and reports that may be connected to various sources, including SQL Server, SSAS and a lot more. Allows writing code in R and Python. |
Free |
Programming languages used in Data Platform Stack
Language Name |
Explanation |
T-SQL |
Transact-SQL, the Structured Query Language variant for SQL Server which offers a rich and powerful automatization language for querying and administration of SQL Server |
PowerShell |
A shell language for administring Windows, but there are also many important packages for the Data Platform Stack. |
Python |
A widely-used general-purpose scripting language which is very popular for the use of Machine Learning and Data Science. SQL Server can execute Python directly. It can also be used in PowerBI Desktop or Azure ML. |
R |
A very popular functional language which focuses on statistical calculations and is helpful in Machine Learning and Data Science. SQL Server can execute R directly. It can also be used in PowerBI Desktop or Azure ML. |
DAX |
Data Analysis Expressions is a functional language used for calculations on SSAS. It is used in SSMS, Visual Studio, PowerBI Desktop and Excel. |
MDX |
Multidimensional Expressions allows us to query multidimensional cubes on SSAS. |
Cover photo by Kolar.io on Unsplash