Tuesday, April 30, 2013

Big Data Analytics with Hadoop

A good presentation; it is helpful for everyone from beginners to advanced users...


SAP Business Objects Data Services (BODS) Interview Questions with Answers

Learn the answers to some critical questions commonly asked in SAP BO Data Services interviews.
1. What is the use of BusinessObjects Data Services?
Answer:
BusinessObjects Data Services provides a graphical interface that allows you to easily create jobs that extract data from heterogeneous sources, transform that data to meet the business requirements of your organization, and load the data into a single location.

Monday, April 29, 2013

Time series analytics on Vertica


Gap Filling and Interpolation (GFI)

A Swiss-Army Knife for Time Series Analytics

Gap Filling and Interpolation (GFI) is a set of patent-pending time series analytics features in Vertica. In this post, through additional use cases, we will show that GFI can enable Vertica users in a wide range of industry sectors to achieve a diverse set of goals.
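
As a rough illustration of the general idea behind gap filling with linear interpolation (this is not Vertica's implementation or syntax), the Python sketch below resamples irregular readings onto a regular time grid. The function name `gap_fill_linear`, the integer timestamps and the sample readings are assumptions made up for this example.

```python
from bisect import bisect_right


def gap_fill_linear(points, step):
    """Resample (timestamp, value) pairs onto a regular grid by linear interpolation.

    points: list of (t, v) sorted by t, with distinct integer timestamps.
    step:   grid spacing, in the same units as t.
    """
    if not points:
        return []
    times = [t for t, _ in points]
    filled = []
    t = times[0]
    while t <= times[-1]:
        i = bisect_right(times, t) - 1       # last observed point at or before t
        t0, v0 = points[i]
        if t0 == t or i == len(points) - 1:
            filled.append((t, float(v0)))    # exact hit: keep the observed value
        else:
            t1, v1 = points[i + 1]
            frac = (t - t0) / (t1 - t0)      # position of t between its neighbours
            filled.append((t, v0 + frac * (v1 - v0)))
        t += step
    return filled


# Hypothetical readings with gaps at t = 1, 3 and 4.
readings = [(0, 10.0), (2, 14.0), (5, 20.0)]
filled = gap_fill_linear(readings, 1)
```

Here `filled` contains one row per time step from 0 to 5, with the missing steps estimated from their nearest observed neighbours.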

Rolling Average with Oracle or Vertica analytical functions.


This little example will demonstrate how to use Oracle's or Vertica's analytical functions to get the rolling average. First you have to create and load a table that contains each month's average temperature in Edinburgh in the years 1764-1820.
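
A window function such as AVG(temp) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) computes, for each row, the average of that row and the two rows before it. The Python sketch below mimics that calculation outside the database so the logic is easy to follow. The column names in the SQL fragment, the function name `rolling_average` and the temperatures are assumptions for illustration, not the Edinburgh data set.

```python
from collections import deque


def rolling_average(values, window):
    """Trailing moving average.

    Each output averages the current value and up to (window - 1) preceding
    values, like ROWS BETWEEN (window - 1) PRECEDING AND CURRENT ROW.
    """
    recent = deque(maxlen=window)   # oldest value drops out automatically
    result = []
    for v in values:
        recent.append(v)
        result.append(sum(recent) / len(recent))
    return result


monthly_temps = [3.0, 5.0, 7.0, 9.0, 11.0]   # made-up values
averages = rolling_average(monthly_temps, 3)
```

As with the SQL window frame, the first rows are averaged over fewer values because fewer preceding rows exist.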

Large-Scale Processing in Netezza.


Transitioning from ETL to ELT

CIO: Why is that uber-powered [commodity RDBMS] system running out of steam? Didn’t we just upgrade?
MANAGER: Yes, but the upgrade didn’t take.
CIO: Didn’t take? Sounds like a doctor transplanting an organ. Do you mean the CPUs rejected it? (laughing)
MANAGER: (soberly) No, just the users. Still too slow.
CIO: That hardware plant cost us [X] million dollars and it had better get it done or I’ll dismantle it for parts. I might dismantle your prima-donna architects with it!

Enhanced Aggregation, Cube, Grouping and Rollup.


(OLAP reporting embedded in SQL)


Many of the OLAP reporting features embedded in Oracle SQL go unused. People turn to expensive OLAP reporting tools on the market, even for simple reporting needs. This article outlines some common OLAP reporting needs and shows how to meet them using the enhanced aggregation features of Oracle SQL.
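
To make the idea of rollup subtotals concrete: GROUP BY ROLLUP(region, product) returns the ordinary (region, product) groups plus a subtotal per region and a grand total. The Python sketch below reproduces that result shape in memory. The function name `rollup`, the column names and the sales rows are hypothetical, and None stands in for the NULL that SQL reports in rolled-up columns.

```python
from collections import defaultdict


def rollup(rows, keys, measure):
    """Sum `measure` for every prefix of `keys`, like GROUP BY ROLLUP(keys...).

    Rolled-up columns are reported as None, mirroring SQL's NULL markers.
    """
    totals = defaultdict(float)
    for row in rows:
        # depth = number of leading key columns kept at this aggregation level
        for depth in range(len(keys), -1, -1):
            kept = tuple(row[k] for k in keys[:depth])
            group = kept + (None,) * (len(keys) - depth)
            totals[group] += row[measure]
    return dict(totals)


sales = [
    {"region": "East", "product": "A", "amount": 10.0},
    {"region": "East", "product": "B", "amount": 5.0},
    {"region": "West", "product": "A", "amount": 7.0},
]
report = rollup(sales, ["region", "product"], "amount")
```

The key `("East", None)` holds the East subtotal and `(None, None)` holds the grand total, which is the same information a reporting tool would show as subtotal and total lines.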

Sunday, April 28, 2013

Analytic functions by Example.


This article explains analytic functions and their various options through a series of simple yet concept-building examples. It is intended for SQL coders who might not be using analytic functions because they are unfamiliar with the cryptic syntax or unsure how the functions operate. I often see people reinvent what analytic functions already provide by using plain joins and sub-queries. The article assumes the reader is familiar with basic Oracle SQL, sub-queries, joins and group functions. Building on that familiarity, it develops the concept of analytic functions through a series of examples.

Installing Hadoop on Ubuntu (12.04) - single node


# Installing Java
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer

# Creating user
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Intro to Hadoop.


What is Hadoop?

Data is growing exponentially. What’s not so clear is how to unlock the value it holds. Hadoop is the answer. Hadoop is an open-source software framework that supports data-intensive distributed applications, licensed under the Apache v2 license. Hadoop is written in the Java programming language. Hadoop was derived from Google's MapReduce and Google File System (GFS) papers.
Google’s MapReduce provides:
  • Automatic parallelization and distribution
  • Fault-tolerance
  • I/O scheduling
  • Status and monitoring
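
The MapReduce programming model itself can be sketched in a few lines. The Python example below is a single-process word count that only illustrates the map, shuffle and reduce steps; it does not use the Hadoop API and provides none of the parallelization, fault tolerance or scheduling listed above. The function names and input documents are made up for this illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


documents = ["Hadoop stores data", "Hadoop processes data in parallel"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map calls run on many nodes at once and the framework performs the shuffle over the network; the per-key logic stays the same.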

Predictive Analytics with Data Mining: How It Works.



by Eric Siegel, Ph.D.
Published in DM Review's DM Direct, February 2005.
Although you've probably heard many times that predictive analytics will optimize your marketing campaigns, it's hard to envision, in more concrete terms, what it will do. This makes it tough to select and direct analytics technology. How can you get a handle on its functional value for marketing, sales and product directions without necessarily becoming an expert?

Seven Principles for Enterprise Data Warehouse Design


In previous columns, I've talked about how you can improve the likelihood of achieving your desired results in building a data management center of excellence and in managing enterprise information. This month, I'd like to narrow the focus to one particular aspect of the enterprise information management spectrum: data warehouse (DW) design.
Contrary to popular sentiment, data warehousing is not a moribund technology; it's alive and kicking. Indeed, most companies deploy data warehousing technology to some extent, and many have an enterprise-wide DW.

Wednesday, April 24, 2013

Integrating Hadoop into Business Intelligence and Data Warehousing: An Overview in 27 Tweets.


To help you better understand how Hadoop can be integrated into business intelligence (BI) and data warehousing (DW) and why you should care, I’d like to share with you the series of 27 tweets I recently issued on the topic. I think you’ll find the tweets interesting, because they provide an overview of these issues and best practices in a form that’s compact, yet amazingly comprehensive.

Every tweet I wrote was a short sound bite or stat bite drawn from my recent TDWI report “Integrating Hadoop in Business Intelligence and Data Warehousing.” Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.

Monday, April 22, 2013

Hadoop Interview Question


1. What is the Hadoop framework?
Answer:
Hadoop is an open-source framework written in Java and maintained by the Apache Software Foundation. It is used to write applications that process vast amounts of data (multiple terabytes) in parallel on large clusters, which can contain thousands of computers (nodes). It processes data in a reliable, fault-tolerant manner.
2. On what concept does the Hadoop framework work?
Answer:
It works on MapReduce, a programming model devised by Google.
3. What is MapReduce?

Understanding Hadoop Clusters and the Network


This article is Part 1 in a series that will take a closer look at the architecture and methods of a Hadoop cluster, and how it relates to the network and server infrastructure. The content presented here is largely based on academic work and conversations I’ve had with customers running real production clusters. If you run production Hadoop clusters in your data center, I’m hoping you’ll provide your valuable insight in the comments below. Subsequent articles will cover the server and network architecture options in closer detail.

How To Build Optimal Hadoop Cluster ( Hadoop recommendations)


Preface

The amount of data stored in databases and files grows every day. This creates a need for cheaper, maintainable and scalable environments capable of storing large amounts of data ("Big Data"). Conventional RDBMS systems have become too expensive and do not scale to today’s needs, so it is time to use or develop new techniques that can.
One of the technologies leading in this direction is cloud computing. There are different implementations of cloud computing, but we selected Hadoop, a MapReduce framework under the Apache license that is based on Google's MapReduce framework.

Sunday, April 21, 2013

Making Big Data and BI Work Together


For enterprise IT and the end-users it supports, the interplay between big data and B.I. can prove as exciting as it is frustrating.

As enterprise executives and end-users eagerly look to gain meaningful intelligence and fast time-to-insight from deep wells of rich data—enabling them to react more quickly and intelligently to market conditions, deliver better customer service, streamline internal operations, and differentiate the organization from among the competition—IT is charged with facilitating such desires for agility even as rivers of data continue to pour into the organization.
With storage costs low enough to store vast amounts of data cost-effectively, many IT organizations opt to store virtually everything they can. While that satisfies some end-user demands, it increases the pressure on the makers of B.I. tools to create offerings robust enough to make meaningful, quick, and accurate sense of all available data.

DWH Concepts and Fundamentals


Data Warehouse

As per Bill Inmon, "A warehouse is a historical, subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process".
By Historical we mean that data is continuously collected from sources and loaded into the warehouse. Previously loaded data is not deleted for a long period of time, so historical data builds up in the warehouse.
By Subject Oriented we mean data grouped into a particular business area instead of the business as a whole.
By Integrated we mean, collecting and merging data from various sources. These sources could be disparate in nature.
By Time-variant we mean that all data in the data warehouse is identified with a particular time period.
By Non-volatile we mean, data that is loaded in the warehouse is based on business transactions in the past, hence it is not expected to change over time.

Data Warehousing Websites



Ralph Kimball Associates
Ralph Kimball Associates focuses on developing, teaching, and delivering dimensional data warehouse design techniques for the community of IT professionals.

The OLAP Report
The OLAP Report website is a vendor-independent, research-based source of information regarding analytical processing of information. It provides detailed, unbiased and regularly updated information on the OLAP market and OLAP products.

The Data Warehousing Institute
The Data Warehousing Institute (TDWI) is the premier provider of in-depth, high quality education and training in the data warehousing and business intelligence industry.

DM Review
The DM Review website is an excellent data warehousing resource focused on business intelligence.

Business Intelligence Network
The B-EYE-Network serves the business intelligence and data warehousing community with unparalleled industry coverage and resources. In response to the growing need for a more sophisticated online resource, the B-EYE-Network delivers industry based content hosted by domain experts and includes horizontal technology coverage from the most respected thought leaders in the BI and DW industry.