
Paul Zikopoulos
Director - IM WW Technical Professionals, WW Competitive Database,
WW Big Data Tiger Team

What is Big Data?

© 2009 IBM Corporation


Agenda

What is Big Data?
What makes Big Data different?
What can you do with Big Data?
Big Data use cases
The IBM Big Data platform
Getting started


© 2012 IBM Corporation


New IM Technology Trends
Information Integration and Governance & Big Data

[Diagram. Labels: Trusted, Relevant, Governed; Transactional & Collaborative Applications; Business Analytics Applications; Analyze; Integrate; Manage; Master Data; Cubes; Streams; Data; Data Warehouses; Content; Big Data; External Information Sources; Streaming Information; Information Governance; Govern: Security & Privacy, Quality, Standards, Lifecycle]



In 2005 there were 1.3 billion RFID tags in circulation…

…by the end of 2011, this was about 30 billion and growing even faster



An increasingly sensor-enabled and instrumented
business environment generates HUGE volumes of
data with MACHINE SPEED characteristics…


1 BILLION lines of code
EACH engine generating 10 TB every 30 minutes!
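
To put "10 TB every 30 minutes" in terms of a sustained data rate, the figure can be converted with simple arithmetic. This is a rough sketch that assumes decimal terabytes (10**12 bytes); the function name and the 24-hour extrapolation are illustrative assumptions, not part of the slide.

```python
# Assumption: the slide's "TB" is a decimal terabyte (10**12 bytes).
TB = 10**12
GB = 10**9


def sustained_rate(volume_bytes: float, interval_seconds: float) -> float:
    """Average bytes per second needed to produce volume_bytes in interval_seconds."""
    return volume_bytes / interval_seconds


# Figure from the slide: 10 TB every 30 minutes, per engine.
rate = sustained_rate(10 * TB, 30 * 60)

# Hypothetical extrapolation: one engine running continuously for 24 hours.
tb_per_day = rate * 24 * 60 * 60 / TB

print(f"{rate / GB:.2f} GB/s per engine")
print(f"{tb_per_day:.0f} TB per engine per day of continuous operation")
```

Under these assumptions a single engine sustains roughly 5.6 GB/s, which is why the slide describes such data as having machine-speed characteristics.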



350B
Transactions/Year

Meter Reads
every 15 min.

120M – meter reads/month
3.65B – meter reads/day


In August of 2010, Adam Savage, of “MythBusters,” took a photo of his vehicle using his smartphone. He then posted the photo to his Twitter account including the phrase “Off to work.”
Since the photo was taken by his smartphone, the image contained metadata revealing the exact geographical location where the photo was taken.
By simply taking and posting a photo, Savage revealed the exact location of his home, the vehicle he drives, and the time he leaves for work.
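
The location metadata mentioned here is typically stored in a photo's EXIF GPS tags as degrees, minutes, and seconds plus a hemisphere reference (N/S or E/W). The sketch below shows only the conversion of such values to decimal degrees; the function name and the sample coordinates are made up for illustration, and reading the tags out of an image file is deliberately left out.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    ref is the hemisphere reference: "N" or "S" for latitude,
    "E" or "W" for longitude. South and west become negative.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Hypothetical values, not taken from any real photo.
latitude = dms_to_decimal(48, 51, 30.0, "N")
longitude = dms_to_decimal(2, 17, 40.0, "E")

print(f"{latitude:.6f}, {longitude:.6f}")
```

A pair of decimal coordinates like this can be pasted straight into a mapping service, which is what makes an unstripped geotag so revealing.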


The Social Layer in an Instrumented Interconnected World
30 billion RFID tags today (1.3B in 2005)
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
2+ billion people on the Web by end 2011
4.6 billion camera phones world wide
100s of millions of GPS enabled devices sold annually
76 million smart meters in 2009… 200M by 2014


Twitter Tweets per Second Record Breakers of 2011



Can a Social Media Persona be Monetized?



Extract Intent, Life Events, Micro Segmentation Attributes
Chloe

Name, Birthday, Family

Tom Sit

Not Relevant - Noise

Tina Mu

Monetizable Intent
Jo Jobs

Not Relevant - Noise


Location

Wishful Thinking

Relocation

SPAMbots

Monetizable Intent



1.8 ZB

1 ZB
1 ZB=1T GB

4 trillion 8 GB iPods
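
The unit relationship on this slide (1 ZB = 1T GB) can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units and devices filled to their full nominal capacity; the helper name is made up for illustration. The extracted slide does not say which volume the "4 trillion 8 GB iPods" comparison refers to, so the code only reports what that many devices would hold.

```python
# Decimal (SI) units, in bytes. Assumption: the slide uses SI prefixes.
GB = 10**9
TB = 10**12
PB = 10**15
ZB = 10**21


def devices_needed(volume_bytes: int, device_bytes: int) -> int:
    """Number of devices of size device_bytes needed to hold volume_bytes (rounded up)."""
    return -(-volume_bytes // device_bytes)


gb_per_zb = ZB // GB                                   # the slide's "1 ZB = 1T GB"
ipods_for_1_8_zb = devices_needed(18 * ZB // 10, 8 * GB)
zb_in_4_trillion_ipods = 4 * 10**12 * 8 * GB / ZB

print(f"1 ZB = {gb_per_zb:,} GB")
print(f"1.8 ZB fills {ipods_for_1_8_zb:,} 8 GB devices")
print(f"4 trillion 8 GB devices hold {zb_in_4_trillion_ipods:.0f} ZB")
```

By this arithmetic, 4 trillion 8 GB devices correspond to 32 ZB, so the comparison presumably belongs to a larger volume than the 1 ZB and 1.8 ZB labels that survive on the slide.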




What is “BIG DATA”?
All kinds of data
Large volumes
Valuable insight, but difficult to extract
Often extremely time sensitive



What makes big data technology different?

Jobs distributed across affordable hardware.
Manages and analyzes all kinds of data.
Analyzes data in native format.



Big Data Includes Any of the Following Characteristics
Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible

Variety: Manage the complexity of data in many different structures, ranging from relational, to logs, to raw text
Velocity: Streaming data and large volume data movement
Volume: Scale from Terabytes to Petabytes (1K TBs) to Zettabytes (1B TBs)


What can you do with big data?

Analyze a Variety of Information
– Social media/sentiment analysis
– Geospatial analysis
– Brand strategy
– Scientific research
– Epidemic early warning system
– Market analysis
– Video analysis
– Audio analysis

Analyze Information in Motion
– Smart Grid management
– Multimodal surveillance
– Real-time promotions
– Cyber security
– ICU monitoring
– Options trading
– Click-stream analysis
– CDR processing
– IT log analysis
– RFID tracking & analysis

Analyze Extreme Volumes of Information
– Transaction analysis to create insight-based product/service offerings
– Fraud modeling & detection
– Risk modeling & management
– Social media/sentiment analysis
– Environmental analysis

Discovery & Experimentation
– Sentiment analysis
– Brand strategy
– Scientific research
– Ad-hoc analysis
– Model development
– Hypothesis testing
– Transaction analysis to create insight-based product/service offerings

Manage and Plan
– Operational analytics – BI reporting
– Planning and forecasting analysis
– Predictive analysis


Applications for Big Data Analytics
Smarter Healthcare
Multi-channel sales
Finance
Log Analysis
Homeland Security
Traffic Control
Telecom
Search Quality
Fraud and Risk
Retail: Churn, NBO
Manufacturing
Trading Analytics


What can you do with big data?

Financial Services
– Fraud detection
– Risk management
– 360° View of the Customer

Transportation
– Weather and traffic impact on logistics and fuel consumption

Health & Life Sciences
– Epidemic early warning system
– ICU monitoring
– Remote healthcare monitoring

Telecommunications
– CDR processing
– Churn prediction
– Geomapping / marketing
– Network monitoring

Utilities
– Weather impact analysis on power generation
– Transmission monitoring
– Smart grid management

IT
– Transaction log analysis for multiple transactional systems
– Cybersecurity

Retail
– 360° View of the Customer
– Click-stream analysis
– Real-time promotions

Law Enforcement
– Real-time multimodal surveillance
– Situational awareness
– Cyber security detection


The Big Data Conundrum
The economies of deletion have changed…
– Leading us into new opportunities and challenges

The percentage of available data an enterprise can analyze is decreasing in proportion to the data available to that enterprise
Quite simply, this means as enterprises, we are getting “more naive” about our business over time

[Chart: data AVAILABLE to an organization vs. data an organization can PROCESS]



Public wind data is available on 284 km x 284 km grids (2.5° LAT/LONG)
More data means more accurate and
richer models (adding hundreds of
variables)
- Vestas wind library at 2.5 PB: to grow to
over 6 PB in the near-term
- Granularity 27km x 27km grids: driving to
9x9, 3x3 to 10m x 10m simulations

Reduced turbine placement
identification from weeks to hours
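
The move to finer grids explains much of the growth in data volume: halving the cell edge quadruples the number of cells over the same area. The sketch below works through the grid sizes quoted above, assuming square cells over a fixed area and ignoring altitude levels and time steps; the function name is illustrative.

```python
KM = 1000  # metres per kilometre


def refinement_factor(coarse_edge_m: float, fine_edge_m: float) -> float:
    """How many fine cells replace one coarse cell when both are square."""
    return (coarse_edge_m / fine_edge_m) ** 2


# Target grid sizes from the slide, relative to the current 27 km x 27 km grid.
targets = {"9 km": 9 * KM, "3 km": 3 * KM, "10 m": 10}
factors = {label: refinement_factor(27 * KM, edge) for label, edge in targets.items()}

for label, factor in factors.items():
    print(f"27 km -> {label}: {factor:,.0f}x as many cells")
```

Under these assumptions, going from 27 km cells to 10 m cells multiplies the cell count by about 7.3 million, before adding any further variables to the model.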

Perspective: The Vestas wind library, as HD TV, would take 70 years to watch


Optimize building energy
consumption with centralized
monitoring and control of
building monitoring system
Automates preventive and
corrective maintenance of
building corrective systems

Uses Streams, InfoSphere
BigInsights and Cognos

Log Analytics
Energy Bill Forecasting
Energy consumption optimization
Detection of anomalous usage
Presence-aware energy mgt.
Policy enforcement


Supply Chain Recommendation for Natural Disasters

Capture market data to calculate cost of stock outs (high volume)
Capture weather sensor data, analyze hurricane predicted path
Estimate impact on inventories
Compute shipping and logistics costs
Make recommendations and notify
DHTML result rendering


Correlate combined risk and impending weather threats to optimize inventory and determine supply chain recommendations

Dynamically updated risk assessment for assets in projected path

Real-time projections of hurricane path


Bigger and Bigger Volumes of Data
Retailers collect click-stream data from Web site interactions and loyalty card-driven transaction data
– This traditional POS information is used by retailers for shopping basket analysis, inventory replenishment, and more
– But data is being provided to suppliers for customer buying analysis

Healthcare has traditionally been dominated by paper-based systems, but this information is getting digitized

Science is increasingly dominated by big science initiatives
– Large-scale experiments generate over 15 PB of data a year, which can’t be stored within the data center and is then sent to laboratories

Financial services are seeing larger volumes through smaller trading sizes, increased market volatility, and technological improvements in automated and algorithmic trading

Improved instrument and sensory technology
– Large Synoptic Survey Telescope’s GPixel camera generates 6PB+ of image data per year; or consider the oil and gas industry



Monetizing Relationships, Not Just Transactions

Amy Bearn
32, Married, mother of 3, Accountant
Telco Score: 91
CPG Score: 76
Fashion Score: 88

Telco company: How valuable is Amy to my mobile phone network? How likely is she to switch carriers? How many other customers will follow?

Retailer: How valuable is Amy to my retail sales? Who does she influence? What do they spend?

[Diagram. Labels: Calling Network, Social Network, Public Database, Merged Network]

