Microsoft SQL Server 2014 Business Intelligence Development: Beginner’s Guide by Reza Rad

The data warehouse is an integrated dimensional data structure. Data from a variety of sources is fed into the data warehouse, and data quality and governance rules are applied to that data.

The dimensional model of data warehousing is optimized for reporting and analysis, so data visualization tools can query the data warehouse directly. These models improve the speed of data access and the performance of queries. BI systems have one or more data visualization frontends that act as the GUI for the end user. In this book, we will go through the BI architecture and explore the Microsoft technologies that can implement and deliver BI solutions.

As a first step, a developer needs to design the data warehouse (DW), which requires an understanding of the key design concepts and of the methodologies used to create a data warehouse.

Chapter 4, ETL with Integration Services, describes how ETL transfers and integrates data from source systems into the data warehouse. ETL needs to run on a scheduled basis.

Chapter 5, Master Data Management, guides readers on how to manage reference data.

Chapter 6, Data Quality and Data Cleansing, explains that data quality is one of the biggest concerns of database systems. Data should be cleansed so that the data warehouse is reliable. In this chapter, readers will learn about data cleansing and how to use Data Quality Services (DQS), one of the new services of SQL Server, to cleanse data in the data warehouse.

In this chapter, readers will understand data mining concepts and how to use data mining algorithms to understand the relationships in historical data, and how to analyze them using Microsoft technologies.

In this chapter, readers will become familiar with algorithms that help in prediction, and with how to use them and customize them with parameters. Readers will also understand how to compare models to find the best algorithm for the case.

Chapter 9, Reporting Services, explores Reporting Services, one of the key tools of the Microsoft BI toolset, which provides different types of reports with charts and grouping options.

Chapter 10, Dashboard Design, describes how dashboards are one of the most popular and useful methods of visualizing data. In this chapter, readers will learn when to use dashboards, how to visualize data with dashboards, and how to use PerformancePoint and Power View to create dashboards.

Chapter 11, Power BI, explains how predesigned reports and dashboards are good for business users, but power users require more flexibility. Power BI is a new self-service BI tool.

Chapter 12, Integrating Reports in Applications, begins with the premise that reports and dashboards are always required in custom applications.

.NET applications in web or Metro applications to provide reports on the application side for the users.

However, you can also download and install MS SQL Server Evaluation Edition, which has the same functionality but is free for a limited number of days. There are many examples in this book, and all of them use the same sample databases as a source. After downloading the database files, open SQL Server Management Studio and run the scripts that create the databases from their data files.

This book is very useful for BI professionals (consultants, architects, and developers) who want to become familiar with Microsoft BI tools. It will also be handy for BI program managers and directors who want to analyze and evaluate Microsoft tools for BI system implementation.

Instructions often need some extra explanation so that they make sense, so they are followed by a heading that explains the working of the tasks or instructions that you have just completed.

You will also find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Expand the Chapter 02 SSAS Multidimensional database and then expand the dimensions. New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: On the Select Destination Location screen, click on Next to accept the default destination.

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or may have disliked. Reader feedback is important for us to develop.

Using an integer as a surrogate key also speeds up the join between a fact and a dimension, because the join and filter criteria are based on integers, and integer comparisons are much faster than string comparisons. If you are thinking about adding the comments made by a salesperson on the sales transaction as another column of the Fact table, first think about the analysis that you want to do based on those comments.

No one does analysis based on a free text field; if you wish to analyze free text, you can categorize the text values through the ETL process and build another dimension for the categories. Then, add the foreign key-primary key relationship between that dimension and the Fact table.

The customer's information, such as the customer name, customer job, customer city, and so on, will be stored in this dimension. You may think that the customer city belongs in another dimension, a Geo dimension.

But the important point is that our goal in dimensional modeling is not normalization, so resist the tendency to normalize tables. For a data warehouse, it is much better to store more customer-related attributes in the customer dimension itself than to design a snowflake schema.

The following diagram shows sample columns of the DimCustomer table.

The DimCustomer dimension may contain many more attributes. The number of attributes in your dimensions is usually high. In fact, a dimension table with a high number of attributes is the power of your data warehouse, because attributes are your filter criteria in the analysis, and the user can slice and dice data by attributes.

So, it is good to think about all possible attributes for that dimension and add them in this step. As discussed earlier, attributes such as Suburb, City, State, and Country sit inside the customer dimension. This is not a normalized design, and it would definitely not be a good design for a transactional database, because it adds redundancy and makes it hard to keep changes consistent.

However, for the data warehouse design, redundancy is not only unimportant; it also speeds up analytical queries and prevents snowflaking. The CustomerKey is the surrogate key and primary key for the dimension in the data warehouse.

The CustomerKey is an integer field, which is autoincremented. It is important that the surrogate key is not an encoded value or a string key; if something is coded in the source, it should be decoded and stored in the relevant attributes. The surrogate key should be different from the primary key of the table in the source system. There are multiple reasons for that; for example, operational systems sometimes recycle their primary keys, which means they reuse the key value of a customer that is no longer in use for a new customer.
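The key-recycling problem can be made concrete with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server so that it runs anywhere; the column list is trimmed, the customer names and the `C-1001` source key are made up, and `add_customer` is a hypothetical helper, not something from the book. CustomerAlternateKey holds the key that the source system assigned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DimCustomer (
        CustomerKey          INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key
        CustomerAlternateKey TEXT NOT NULL,                     -- key from the source system
        FullName             TEXT NOT NULL
    )
    """
)


def add_customer(source_key: str, full_name: str) -> int:
    """Insert one dimension row and return the generated surrogate key."""
    cur = conn.execute(
        "INSERT INTO DimCustomer (CustomerAlternateKey, FullName) VALUES (?, ?)",
        (source_key, full_name),
    )
    return cur.lastrowid


first_key = add_customer("C-1001", "Alice Example")
# Assumed scenario: the source system later reuses C-1001 for a new customer.
second_key = add_customer("C-1001", "Bob Example")

rows = conn.execute(
    "SELECT CustomerKey, CustomerAlternateKey, FullName "
    "FROM DimCustomer ORDER BY CustomerKey"
).fetchall()
for row in rows:
    print(row)
```

Both rows share the same source key, but each has its own surrogate key, so facts recorded for the earlier customer are not silently reassigned to the new one.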

CustomerAlternateKey is the primary key of the source system. It is important to keep the primary key of the source system stored in the dimension, because it is needed to identify changes in the source table and apply them to the dimension. The primary key of the source system is called the business key or alternate key.

The date dimension is one of the dimensions that you will find in most business processes.

There may be rare situations where you work with a Fact table that doesn't store date-related information. Storing attributes such as the year or month next to the full date may look redundant, as you can derive all the other columns from the full date column with date functions, but that adds extra processing time. So, when designing dimensions, don't worry about storage space; add as many attributes as required. The following diagram shows sample columns of the date dimension. It is useful to store holidays, weekdays, and weekends in the date dimension, because in sales figures a holiday or weekend will definitely affect the sales transactions and amounts.

So, the user will need to understand why sales are higher on a specific date than on other days. You may also add another attribute for promotions in this example, which states whether that specific date is a promotion date or not.

The date dimension will have a record for each date. The table in the following screenshot shows sample records of the date dimension. As you can see in the records illustrated in that screenshot, the surrogate key of the date dimension (DateKey) shows a meaningful value.

This is one of the rare exceptions where we can keep the surrogate key of this dimension as an integer type but with the format of YYYYMMDD to represent a meaning as well. In this example, if we store time information, where do you think would be the place for time attributes?

Inside the date dimension? Definitely not. The date dimension stores one record per day, so it will have 365 records per year and 3,650 records for 10 years. If you also stored time in the same dimension, for example one record per minute, every day would need 1,440 records, and the 10-year dimension would grow to more than 5 million records. However, 5 million records for a single dimension are too much; dimensions are usually narrow, and only occasionally do they have more than one million records. So in this case, the best practice is to add another dimension, DimTime, and put all time-related attributes in that dimension.
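The arithmetic behind that recommendation can be checked in a few lines. This is only a sketch: the minute grain is an assumption chosen because it lands near the 5 million figure, and leap days are ignored.

```python
DAYS_PER_YEAR = 365          # leap days ignored for simplicity
YEARS = 10
MINUTES_PER_DAY = 24 * 60    # assumed time grain: one record per minute

date_rows = DAYS_PER_YEAR * YEARS

# One combined dimension: every date is repeated for every minute of the day.
combined_rows = date_rows * MINUTES_PER_DAY

# Two separate dimensions: DimDate holds the days, DimTime holds the minutes.
separate_rows = date_rows + MINUTES_PER_DAY

print(date_rows, combined_rows, separate_rows)
```

With these assumptions the combined dimension needs 5,256,000 rows, while DimDate and DimTime together need only 5,090.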

The following screenshot shows some example records and attributes of DimTime. Usually, the date and time dimensions are generic and static, so you won't need to populate these dimensions through ETL every night; you just load them once and then use them.
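As an illustration of a load-once date dimension, the sketch below generates one row per day with a YYYYMMDD integer key. It is not the author's script; the function name `build_date_rows` and the attribute names other than DateKey are assumptions, and a real date dimension would carry many more attributes (holidays, fiscal periods, promotions).

```python
from datetime import date, timedelta


def build_date_rows(start: date, end: date) -> list[dict]:
    """Return one dictionary per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append(
            {
                "DateKey": int(current.strftime("%Y%m%d")),  # e.g. 20140101
                "FullDate": current.isoformat(),
                "Year": current.year,
                "Month": current.month,
                "DayOfMonth": current.day,
                "IsWeekend": current.weekday() >= 5,  # Saturday or Sunday
            }
        )
        current += timedelta(days=1)
    return rows


date_rows = build_date_rows(date(2014, 1, 1), date(2014, 12, 31))
print(len(date_rows), date_rows[0])
```

Because the rows depend only on the calendar, the result can be loaded once and reused, which matches the point that this dimension does not need a nightly ETL run.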

I've written two general-purpose scripts to create and populate date and time dimensions on my blog that you can use.

The product dimension will have a ProductKey, which is the surrogate key, and a business key, which will be the primary key of the product in the source system (something similar to a product's unique number).

The product dimension will also have information about the product categories. Again, the dimension is denormalized: the product subcategory and category are placed into the product dimension with redundant values. This decision was made in order to avoid snowflaking and to improve the performance of the join between the fact and the dimensions. We are not going to go through the attributes of the store dimension in detail.

The most important part of this dimension is that it can have a relationship to the date dimension. For example, a store's opening date will be a key related to the date dimension. This type of snow flaking is unavoidable because you cannot copy all the date dimension's attributes in every other dimension that relates to it.

They are also good for aggregations and dashboards. The Snapshot Fact tables provide a very fast response for dashboards and aggregated queries, but they don't cover detailed transactional records.

Based on your requirement analysis, you can create both kinds of facts or only one of them. There is also another type of Fact table called the accumulating Fact table.

This Fact table is useful for storing processes and activities, such as order management. You can read more about different types of Fact tables in The Data Warehouse Toolkit, Ralph Kimball, Wiley, which was referenced earlier in this chapter. We've explained that Fact tables usually contain FKs of dimensions and some measures.

However, there are times when you would require a Fact table without any measure. These types of Fact tables are usually used to show the non-existence of a fact. For example, assume that the sales business process runs promotions as well, and you have a promotion dimension. Each entry in the Fact table shows that a customer X purchased a product Y on a date Z from a store S while the promotion P was on (such as the New Year's sales). This Fact table covers every requirement that queries information about the sales that happened, or in other words, the transactions that happened.

However, there are times when the promotion is on but no transaction happens! A report on this is valuable for decision makers, because it lets them understand the situation and investigate what was wrong with a promotion that did not generate sales. So, this is an example of a requirement that the existing Fact table, with the sales amount and other measures, doesn't fulfill.

The Fact table that does fulfill it doesn't have any fact or measure; it just has FKs for dimensions. However, it is very informative, because it tells us on which dates there was a promotion at specific stores on specific products. We call this a Factless Fact table or Bridge table.

Using examples, we've explored the usual dimensions such as customer and date. When a dimension participates in more than one business process and serves different data marts (such as date), it is called a conformed dimension.
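To see how a Factless Fact table answers the "promotion without sales" question raised earlier, here is a minimal sketch using Python's built-in sqlite3 module. The table name FactPromotion, the key values, and the sample rows are all made up for illustration; only the idea of a fact table holding nothing but dimension keys comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE FactSales (
        DateKey INTEGER, StoreKey INTEGER, ProductKey INTEGER,
        PromotionKey INTEGER, SalesAmount REAL
    );
    -- Factless fact: only foreign keys, no measure.
    CREATE TABLE FactPromotion (
        DateKey INTEGER, StoreKey INTEGER, ProductKey INTEGER,
        PromotionKey INTEGER
    );
    INSERT INTO FactPromotion VALUES (20140101, 1, 10, 7);
    INSERT INTO FactPromotion VALUES (20140101, 1, 11, 7);
    INSERT INTO FactSales VALUES (20140101, 1, 10, 7, 59.90);
    """
)

# Promotions that were running but produced no matching sales row.
unsold = conn.execute(
    """
    SELECT p.DateKey, p.StoreKey, p.ProductKey, p.PromotionKey
    FROM FactPromotion AS p
    LEFT JOIN FactSales AS s
      ON  s.DateKey = p.DateKey
      AND s.StoreKey = p.StoreKey
      AND s.ProductKey = p.ProductKey
      AND s.PromotionKey = p.PromotionKey
    WHERE s.DateKey IS NULL
    """
).fetchall()
print(unsold)
```

The sales table alone could never return product 11, because no sales row exists for it; the factless table supplies the missing side of the comparison.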

Sometimes, a dimension needs to be used in the Fact table more than once. For example, in the FactSales table, you may want to store the order date, shipping date, and transaction date. All three columns will point to the date dimension. In this situation, we won't create three separate dimensions; instead, we will reuse the existing DimDate three times under three different names. So, the date dimension literally plays the role of more than one dimension.
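A rough sketch of this reuse, again with Python's sqlite3 module: one DimDate table is joined three times under different aliases. The foreign key column names (OrderDateKey, ShipDateKey, TransactionDateKey) and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, FullDate TEXT);
    CREATE TABLE FactSales (
        OrderDateKey       INTEGER REFERENCES DimDate (DateKey),
        ShipDateKey        INTEGER REFERENCES DimDate (DateKey),
        TransactionDateKey INTEGER REFERENCES DimDate (DateKey),
        SalesAmount        REAL
    );
    INSERT INTO DimDate VALUES (20140101, '2014-01-01');
    INSERT INTO DimDate VALUES (20140103, '2014-01-03');
    INSERT INTO FactSales VALUES (20140101, 20140103, 20140101, 120.0);
    """
)

# The same physical table plays three roles through three aliases.
row = conn.execute(
    """
    SELECT o.FullDate, s.FullDate, t.FullDate, f.SalesAmount
    FROM FactSales AS f
    JOIN DimDate AS o ON o.DateKey = f.OrderDateKey
    JOIN DimDate AS s ON s.DateKey = f.ShipDateKey
    JOIN DimDate AS t ON t.DateKey = f.TransactionDateKey
    """
).fetchone()
print(row)
```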

This is the reason we call such dimensions role-playing dimensions. There are other types of dimensions with some differences, such as the junk dimension and the degenerate dimension. The junk dimension is used for dimensions that have very few member values (records) and are used in almost only one data mart (that is, they are not conformed).

For example, status dimensions can be good candidates for a junk dimension. If you create a status dimension for each situation in each data mart, then you will probably have more than ten status dimensions with fewer than five records in each. The junk dimension is a solution that combines such narrow dimensions into one bigger dimension.
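One common way to build such a combined dimension is to take every combination of the small status lists and give each combination its own key. The sketch below assumes that approach; the three status lists and the JunkKey name are invented for the example.

```python
from itertools import product

# Three narrow "dimensions" that would each have only a few records.
order_statuses = ["Open", "Shipped", "Cancelled"]
payment_statuses = ["Unpaid", "Paid"]
delivery_types = ["Standard", "Express"]

# One junk dimension row per combination, with a single surrogate key.
junk_dimension = [
    {"JunkKey": key, "OrderStatus": o, "PaymentStatus": p, "DeliveryType": d}
    for key, (o, p, d) in enumerate(
        product(order_statuses, payment_statuses, delivery_types), start=1
    )
]

# Lookup used while loading facts: status combination -> JunkKey.
lookup = {
    (r["OrderStatus"], r["PaymentStatus"], r["DeliveryType"]): r["JunkKey"]
    for r in junk_dimension
}
print(len(junk_dimension), lookup[("Shipped", "Paid", "Express")])
```

The Fact table then stores one JunkKey column instead of three separate status keys, at the cost of a dimension whose name says little about its contents.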

You may or may not use a junk dimension in your data mart because using junk dimensions reduces readability, and not using it will increase the number of narrow dimensions. So, the usage of this is based on the requirement analysis phase and the dimensional modeling of the star schema. A degenerate dimension is another type of dimension, which is not a separate dimension table. In other words, a degenerate dimension doesn't have a table and it sits directly inside the Fact table.

Assume that you want to store the transaction number (a string value). Where do you think would be the best place to add that information? You may think that you would create another dimension, enter the transaction number there, assign a surrogate key, and use that surrogate key in the Fact table. This is not an ideal solution, because that dimension would have exactly the same Grain as your Fact table. The number of records in the sales transaction dimension would then equal the number of records in the Fact table, so you would have a very deep dimension table, which is not recommended.

On the other hand, you cannot think of another attribute for that dimension, because all attributes related to the sales transaction already exist in other dimensions connected to the fact. So, instead of creating a dimension with the same Grain as the fact and with only one column, we leave that column inside the Fact table, even if it is a string.

This type of dimension is called a degenerate dimension. Now that you understand dimensions, it is a good time to go into more detail about one of the most challenging concepts of data warehousing, the slowly changing dimension (SCD). A dimension's attribute values may change, and depending on the requirement, you will take different actions in response to that change. As changes in a dimension's attribute values happen only occasionally, this is called the slowly changing dimension.

SCD is split into different types, depending on the action to be taken after the change. In this section, we only discuss types 0, 1, and 2. Type 0 doesn't accept any changes. Let's assume that the Employee Number is inside the Employee dimension. Employee Number is the business key, and it is an important attribute for ETL because ETL distinguishes new employees from existing employees based on this field. So we don't accept any changes in this attribute. This means that type 0 of SCD is applied on this attribute.
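A small sketch can show how an ETL step might treat attributes differently by SCD type. It covers type 0 as described above and, as an assumption based on the usual definition, treats type 1 as overwriting the old value in place. The attribute names, the `SCD_TYPE` mapping, and the `apply_source_row` function are hypothetical.

```python
# Assumed SCD type per attribute: 0 = never change, 1 = overwrite in place.
SCD_TYPE = {"EmployeeNumber": 0, "FirstName": 1}


def apply_source_row(dimension_row: dict, source_row: dict) -> dict:
    """Return a copy of the dimension row with permitted changes applied."""
    updated = dict(dimension_row)
    for attribute, scd_type in SCD_TYPE.items():
        if source_row[attribute] == dimension_row[attribute]:
            continue  # nothing changed for this attribute
        if scd_type == 1:
            updated[attribute] = source_row[attribute]  # overwrite old value
        # scd_type == 0: the change is ignored and the original value stays
    return updated


existing = {"EmployeeKey": 1, "EmployeeNumber": "E-042", "FirstName": "Jhon"}
incoming = {"EmployeeNumber": "E-042X", "FirstName": "John"}

result = apply_source_row(existing, incoming)
print(result)
```

Here the corrected first name is accepted, while the attempted change to the Employee Number is dropped.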

Sometimes, a value such as the first name may be mistyped in the source system, and it is likely that someone will later fix it with a change.

This book starts with designing a data warehouse with dimensional modeling, and then looks at creating data models based on SSAS multidimensional and Tabular technologies.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information. Reza Rad has more than 10 years of experience in databases and software applications. Most of his work experience is in data warehousing and business intelligence.

He has a Bachelor's degree in Computer Engineering. He has worked with large enterprises around the world and delivered high-quality data warehousing and BI solutions for them.

He has worked with industries in different sectors, such as Health, Finance, Logistics, Sales, Order Management, Manufacturing, Telecommunication, and so on. Reza has written books on SQL Server and databases.

His blog contains the latest information on his presentations and publications. Reza is a Mentor and a Microsoft Certified Trainer.

He has been in the professional training business for many years. He conducts extensive handed-level training for many enterprises around the world via both remote and in-person training. He has worked for more than 10 years with Oracle Corporation and has held various positions, including that of a Practice Manager.

He had been co-running the North Business Intelligence and Warehouse Consulting practice, delivering business intelligence solutions to Fortune clients. During this time, he steadily added business skills and business training to his technical background.

In , John decided to leave Oracle and become a founding member in a small business named iSeerix. This allowed him to focus on strategic partnerships with clients to design and build Business Intelligence and data warehouse solutions.

John's strengths include the ability to communicate the benefits of introducing a Business Intelligence solution to a client's architecture. He has gradually become a trusted advisor to his clients. His philosophy is based on responsibility and mutual respect.

He relies on the unique abilities of individuals to ensure success in different areas and strives to foster a team environment of creativity and achievement. Through the years, he has worked in numerous industries with differing technologies. This broad experience base allows him to bring a unique perspective and understanding when designing and developing a data warehouse.

The strong business background, coupled with technical expertise, and his certification in Project Management makes him a valued asset to any data warehouse project. Goh Yong Hwee is a database specialist, systems engineer, developer, and trainer based in Singapore. Throughout his training, he has consistently maintained a Metrics that Matter score exceeding 8 out of He has also been instrumental in customizing and reviewing his training center's training for its clients.

When imparting knowledge, his objective has been to make technologies easy and simple for everyone to learn. His no-frills approach to training has gained him recognition over the years from both clients and employers, where his clinching of the Best Instructor Award, an accolade conferred by his employer, bore testimonial.

Over the years, he has chosen to focus his work and specialization on Microsoft SQL Server and is currently in full-time employment with a Fortune company in Singapore, taking up the specialist, consultancy, developer, and management roles. Raunak T. Jhawar is a graduate in Computer Science from the University of Pune and has more than five years of experience working as a software professional working with BI, data visualization, and Hadoop.

Raunak is presently working with Aditi Technologies in Bangalore as a Technical Leader, working with clients and consulting them for their BI and analytics engagements. He currently leads an ambitious BI project for Betgenius Ltd. He started his career as a software developer, and then he was a DBA for 12 years. He has been a permanent employee, consultant, contractor, and owner of his own business.

All these experiences, along with continuous learning, have helped him to develop many successful data warehouse and BI projects.
