MICROSOFT SQL SERVER 2012 WITH HADOOP: INTEGRATE DATA BETWEEN APACHE HADOOP AND SQL SERVER 2012 AND PROVIDE BUSINESS INTELLIGENCE ON THE HETEROGENEOUS DATA
What you will learn from this book
Use the native Sqoop connector for data movement between SQL Server 2012 and Hadoop
Configure and use the Hive ODBC driver to enable any ODBC-compliant client to consume Hadoop data
Create ETL solutions and automate data movement jobs between SQL Server 2012 and Hadoop using SQL Server Integration Services
Provide powerful reporting on the integrated data in just a few clicks using Microsoft self-service BI tools
Merge structured and unstructured data together in a common warehouse for analysis, which is essential
In detail
With the explosion of data, the open source Apache Hadoop framework is gaining traction, thanks to the large ecosystem that has grown up around its core components, the Hadoop Distributed File System (HDFS) and MapReduce. Getting SQL Server to talk to Hadoop has become increasingly important because the two are complementary. Hadoop can store petabytes of unstructured data that take hours to query, while SQL Server 2012 can store terabytes of structured data that can be queried in seconds. This creates the need to transfer and integrate data between Hadoop and SQL Server.
Microsoft SQL Server 2012 with Hadoop is aimed at SQL Server developers. It will quickly show you how to get Hadoop activated on SQL Server 2012 (it ships with this version). Once this is done, the book will focus on how to manage big data with Hadoop and use Hive to query the data. It will also cover topics such as using SQL Server's in-memory functions and using BI tools with big data.
Microsoft SQL Server 2012 with Hadoop focuses on data integration techniques between the relational (SQL Server 2012) and non-relational (Hadoop) worlds. It will walk you through different tools for the bi-directional movement of data with practical examples.
You will learn to use open source connectors such as Sqoop to import and export data between SQL Server 2012 and Hadoop, and to create ETL solutions with the Hive ODBC driver for your data movement projects, working with leading in-memory BI tools. Finally, this book will give you a glimpse of present-day self-service BI tools such as Excel and Power View, which consume Hadoop data and provide powerful insights into it.
Approach
This book is a step-by-step tutorial that teaches you to work with big data on SQL Server through practical examples of increasing complexity.
Who this book is for
Microsoft SQL Server 2012 with Hadoop is specifically targeted at readers who want to cross-pollinate their Hadoop skills with SQL Server 2012 business intelligence and data analytics. A basic understanding of traditional RDBMS technologies and query processing techniques is essential.
About the author
Debarchan Sarkar is a Microsoft Data Platform engineer who hails from Calcutta, India, the "City of Joy". He has been a SQL Server engineer with Microsoft India for the last six years and has now started venturing into the open source world, specifically the Apache Hadoop framework. He is a SQL Server business intelligence specialist with subject matter expertise in SQL Server Integration Services.
Table of contents:
Preface
Chapter 1: Introduction to big data and Hadoop
Big data – what's the big deal?
The Apache Hadoop framework
HDFS
MapReduce
NameNode
Secondary NameNode
DataNode
JobTracker
TaskTracker
Hive
Pig
Flume
Sqoop
Oozie
HBase
Mahout
Summary
Chapter 2: Using Sqoop – the SQL Server Hadoop connector
The SQL Server-Hadoop connector
Installation prerequisites
A Hadoop cluster on Linux
Installing and configuring Sqoop
Setting up the Microsoft JDBC driver
Downloading the SQL Server-Hadoop connector
Installing the SQL Server-Hadoop connector
The Sqoop import tool
Importing the tables in Hive
The Sqoop export tool
Data types
Summary
Chapter 3: Using the Hive ODBC driver
The Hive ODBC driver
SQL Server Integration Services (SSIS)
SSIS as an ETL – extract, transform, and load tool
Developing the package
Creating the project
Creating the data flow
Creating the source Hive connection
Creating the destination SQL connection
Creating the Hive source component
Creating the SQL destination component
Mapping the columns
Running the package
Summary
Chapter 4: Creating a data model with SQL Server Analysis Services
Configuring the SQL linked server to Hive
The linked server script
Using OpenQuery
Creating a view
Creating an SSAS data model
Summary
Chapter 5: Using Microsoft's self-service business intelligence tools
PowerPivot enhancements
Power View for Excel
Summary
Author : Debarchan Sarkar
Publication : Packt Publishing
ISBN : 9789351103271
Store book number : 105
NRS 320.00