
MongoDB: The Definitive Guide, Second Edition




SECOND EDITION

MongoDB: The Definitive Guide

Kristina Chodorow


MongoDB: The Definitive Guide, Second Edition
by Kristina Chodorow
Copyright © 2013 Kristina Chodorow. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (). For more information, contact our corporate/
institutional sales department: 800-998-9938 or

Editor: Ann Spencer
Production Editor: Kara Ebrahim
Proofreader: Amanda Kersey
Indexer: Stephen Ingle, WordCo Indexing
Cover Designer: Randy Comer
Interior Designer: David Futato
Illustrator: Rebecca Demarest

May 2013: Second Edition

Revision History for the Second Edition:
2013-05-08: First release

See for release details.
Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly
Media, Inc. MongoDB: The Definitive Guide, Second Edition, the image of a mongoose lemur, and related
trade dress are trademarks of O’Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark
claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and author assume no
responsibility for errors or omissions, or for damages resulting from the use of the information contained
herein.

ISBN: 978-1-449-34468-9
[LSI]


Table of Contents

Foreword
Preface

Part I. Introduction to MongoDB

1. Introduction
Ease of Use
Easy Scaling
Tons of Features…
…Without Sacrificing Speed
Let’s Get Started

2. Getting Started
Documents
Collections
Dynamic Schemas
Naming
Databases
Getting and Starting MongoDB
Introduction to the MongoDB Shell
Running the Shell
A MongoDB Client
Basic Operations with the Shell
Data Types
Basic Data Types
Dates
Arrays
Embedded Documents
_id and ObjectIds
Using the MongoDB Shell
Tips for Using the Shell
Running Scripts with the Shell
Creating a .mongorc.js
Customizing Your Prompt
Editing Complex Variables
Inconvenient Collection Names

3. Creating, Updating, and Deleting Documents
Inserting and Saving Documents
Batch Insert
Insert Validation
Removing Documents
Remove Speed
Updating Documents
Document Replacement
Using Modifiers
Upserts
Updating Multiple Documents
Returning Updated Documents
Setting a Write Concern

4. Querying
Introduction to find
Specifying Which Keys to Return
Limitations
Query Criteria
Query Conditionals
OR Queries
$not
Conditional Semantics
Type-Specific Queries
null
Regular Expressions
Querying Arrays
Querying on Embedded Documents
$where Queries
Server-Side Scripting
Cursors
Limits, Skips, and Sorts
Avoiding Large Skips
Advanced Query Options
Getting Consistent Results
Immortal Cursors
Database Commands
How Commands Work

Part II. Designing Your Application

5. Indexing
Introduction to Indexing
Introduction to Compound Indexes
Using Compound Indexes
How $-Operators Use Indexes
Indexing Objects and Arrays
Index Cardinality
Using explain() and hint()
The Query Optimizer
When Not to Index
Types of Indexes
Unique Indexes
Sparse Indexes
Index Administration
Identifying Indexes
Changing Indexes

6. Special Index and Collection Types
Capped Collections
Creating Capped Collections
Sorting Au Naturel
Tailable Cursors
No-_id Collections
Time-To-Live Indexes
Full-Text Indexes
Search Syntax
Full-Text Search Optimization
Searching in Other Languages
Geospatial Indexing
Types of Geospatial Queries
Compound Geospatial Indexes
2D Indexes
Storing Files with GridFS
Getting Started with GridFS: mongofiles
Working with GridFS from the MongoDB Drivers
Under the Hood

7. Aggregation
The Aggregation Framework
Pipeline Operations
$match
$project
$group
$unwind
$sort
$limit
$skip
Using Pipelines
MapReduce
Example 1: Finding All Keys in a Collection
Example 2: Categorizing Web Pages
MongoDB and MapReduce
Aggregation Commands
count
distinct
group

8. Application Design
Normalization versus Denormalization
Examples of Data Representations
Cardinality
Friends, Followers, and Other Inconveniences
Optimizations for Data Manipulation
Optimizing for Document Growth
Removing Old Data
Planning Out Databases and Collections
Managing Consistency
Migrating Schemas
When Not to Use MongoDB

Part III. Replication

9. Setting Up a Replica Set
Introduction to Replication
A One-Minute Test Setup
Configuring a Replica Set
rs Helper Functions
Networking Considerations
Changing Your Replica Set Configuration
How to Design a Set
How Elections Work
Member Configuration Options
Creating Election Arbiters
Priority
Hidden
Slave Delay
Building Indexes

10. Components of a Replica Set
Syncing
Initial Sync
Handling Staleness
Heartbeats
Member States
Elections
Rollbacks
When Rollbacks Fail

11. Connecting to a Replica Set from Your Application
Client-to-Replica-Set Connection Behavior
Waiting for Replication on Writes
What Can Go Wrong?
Other Options for “w”
Custom Replication Guarantees
Guaranteeing One Server per Data Center
Guaranteeing a Majority of Nonhidden Members
Creating Other Guarantees
Sending Reads to Secondaries
Consistency Considerations
Load Considerations
Reasons to Read from Secondaries

12. Administration
Starting Members in Standalone Mode
Replica Set Configuration
Creating a Replica Set
Changing Set Members
Creating Larger Sets
Forcing Reconfiguration
Manipulating Member State
Turning Primaries into Secondaries
Preventing Elections
Using Maintenance Mode
Monitoring Replication
Getting the Status
Visualizing the Replication Graph
Replication Loops
Disabling Chaining
Calculating Lag
Resizing the Oplog
Restoring from a Delayed Secondary
Building Indexes
Replication on a Budget
How the Primary Tracks Lag
Master-Slave
Converting Master-Slave to a Replica Set
Mimicking Master-Slave Behavior with Replica Sets

Part IV. Sharding

13. Introduction to Sharding
Introduction to Sharding
Understanding the Components of a Cluster
A One-Minute Test Setup

14. Configuring Sharding
When to Shard
Starting the Servers
Config Servers
The mongos Processes
Adding a Shard from a Replica Set
Adding Capacity
Sharding Data
How MongoDB Tracks Cluster Data
Chunk Ranges
Splitting Chunks
The Balancer

15. Choosing a Shard Key
Taking Stock of Your Usage
Picturing Distributions
Ascending Shard Keys
Randomly Distributed Shard Keys
Location-Based Shard Keys
Shard Key Strategies
Hashed Shard Key
Hashed Shard Keys for GridFS
The Firehose Strategy
Multi-Hotspot
Shard Key Rules and Guidelines
Shard Key Limitations
Shard Key Cardinality
Controlling Data Distribution
Using a Cluster for Multiple Databases and Collections
Manual Sharding

16. Sharding Administration
Seeing the Current State
Getting a Summary with sh.status
Seeing Configuration Information
Tracking Network Connections
Getting Connection Statistics
Limiting the Number of Connections
Server Administration
Adding Servers
Changing Servers in a Shard
Removing a Shard
Changing Config Servers
Balancing Data
The Balancer
Changing Chunk Size
Moving Chunks
Jumbo Chunks
Refreshing Configurations

Part V. Application Administration

17. Seeing What Your Application Is Doing
Seeing the Current Operations
Finding Problematic Operations
Killing Operations
False Positives
Preventing Phantom Operations
Using the System Profiler
Calculating Sizes
Documents
Collections
Databases
Using mongotop and mongostat

18. Data Administration
Setting Up Authentication
Authentication Basics
Setting Up Authentication
How Authentication Works
Creating and Deleting Indexes
Creating an Index on a Standalone Server
Creating an Index on a Replica Set
Creating an Index on a Sharded Cluster
Removing Indexes
Beware of the OOM Killer
Preheating Data
Moving Databases into RAM
Moving Collections into RAM
Custom-Preheating
Compacting Data
Moving Collections
Preallocating Data Files

19. Durability
What Journaling Does
Planning Commit Batches
Setting Commit Intervals
Turning Off Journaling
Replacing Data Files
Repairing Data Files
The mongod.lock File
Sneaky Unclean Shutdowns
What MongoDB Does Not Guarantee
Checking for Corruption
Durability with Replication

Part VI. Server Administration

20. Starting and Stopping MongoDB
Starting from the Command Line
File-Based Configuration
Stopping MongoDB
Security
Data Encryption
SSL Connections
Logging

21. Monitoring MongoDB
Monitoring Memory Usage
Introduction to Computer Memory
Tracking Memory Usage
Tracking Page Faults
Minimizing Btree Misses
IO Wait
Tracking Background Flush Averages
Calculating the Working Set
Some Working Set Examples
Tracking Performance
Tracking Free Space
Monitoring Replication

22. Making Backups
Backing Up a Server
Filesystem Snapshot
Copying Data Files
Using mongodump
Backing Up a Replica Set
Backing Up a Sharded Cluster
Backing Up and Restoring an Entire Cluster
Backing Up and Restoring a Single Shard
Creating Incremental Backups with mongooplog

23. Deploying MongoDB
Designing the System
Choosing a Storage Medium
Recommended RAID Configurations
CPU
Choosing an Operating System
Swap Space
Filesystem
Virtualization
Turn Off Memory Overcommitting
Mystery Memory
Handling Network Disk IO Issues
Using Non-Networked Disks
Configuring System Settings
Turning Off NUMA
Setting a Sane Readahead
Disabling Hugepages
Choosing a Disk Scheduling Algorithm
Don’t Track Access Time
Modifying Limits
Configuring Your Network
System Housekeeping
Synchronizing Clocks
The OOM Killer
Turn Off Periodic Tasks

A. Installing MongoDB
B. MongoDB Internals
Index


Foreword

In the last 10 years, the Internet has challenged relational databases in ways nobody
could have foreseen. Having used MySQL at large and growing Internet companies
during this time, I’ve seen this happen firsthand. First you have a single server with a
small data set. Then you find yourself setting up replication so you can scale out reads
and deal with potential failures. And, before too long, you’ve added a caching layer,
tuned all the queries, and thrown even more hardware at the problem.
Eventually you arrive at the point when you need to shard the data across multiple
clusters and rebuild a ton of application logic to deal with it. And soon after that you
realize that you’re locked into the schema you modeled so many months before.
Why? Because there’s so much data in your clusters now that altering the schema will
take a long time and involve a lot of precious DBA time. It’s easier just to work around
it in code. This can keep a small team of developers busy for many months. In the end,
you’ll always find yourself wondering if there’s a better way—or why more of these
features are not built into the core database server.
Keeping with tradition, the Open Source community has created a plethora of “better
ways” in response to the ballooning data needs of modern web applications. They span
the spectrum from simple in-memory key/value stores to complicated SQL-speaking
MySQL/InnoDB derivatives. But the sheer number of choices has made finding the right
solution more difficult. I’ve looked at many of them.
I was drawn to MongoDB by its pragmatic approach. MongoDB doesn’t try to be everything
to everyone. Instead it strikes the right balance between features and complexity,
with a clear bias toward making previously difficult tasks far easier. In other
words, it has the features that really matter to the vast majority of today’s web applications:
indexes, replication, sharding, a rich query syntax, and a very flexible data model.
All of this comes without sacrificing speed.
Like MongoDB itself, this book is very straightforward and approachable. New
MongoDB users can start with Chapter 1 and be up and running in no time. Experienced
users will appreciate this book’s breadth and authority. It’s a solid reference for advanced
administrative topics such as replication, backups, and sharding, as well as popular client
APIs.
Having recently started to use MongoDB in my day job, I have no doubt that this book
will be at my side for the entire journey—from the first install to production deployment
of a sharded and replicated cluster. It’s an essential reference to anyone seriously looking
at using MongoDB.
—Jeremy Zawodny
Craigslist Software Engineer
August 2010



Preface

How This Book Is Organized
This book is split up into six sections, covering development, administration, and
deployment information.


Getting Started with MongoDB
In Chapter 1 we provide background about MongoDB: why it was created, the goals it
is trying to accomplish, and why you might choose to use it for a project. We go into
more detail in Chapter 2, which provides an introduction to the core concepts and
vocabulary of MongoDB. Chapter 2 also provides a first look at working with MongoDB,
getting you started with the database and the shell. The next two chapters cover the
basic material that developers need to know to work with MongoDB. In Chapter 3, we
describe how to perform basic write operations, including how to do them with
different levels of safety and speed. Chapter 4 explains how to find documents and create
complex queries. This chapter also covers how to iterate through results and gives options
for limiting, skipping, and sorting results.

Developing with MongoDB
Chapter 5 covers what indexing is and how to index your MongoDB collections. Chapter 6
explains how to use several special types of indexes and collections. Chapter 7
covers a number of techniques for aggregating data with MongoDB, including counting,
finding distinct values, grouping documents, the aggregation framework, and using
MapReduce. Finally, this section finishes with a chapter on designing your application:
Chapter 8 goes over tips for writing an application that works well with MongoDB.



Replication
The replication section starts with Chapter 9, which gives you a quick way to set up a
replica set locally and covers many of the available configuration options. Chapter 10
then covers the various concepts related to replication. Chapter 11 shows how replication
interacts with your application and Chapter 12 covers the administrative aspects
of running a replica set.


Sharding
The sharding section starts in Chapter 13 with a quick local setup. Chapter 14 then gives
an overview of the components of the cluster and how to set them up. Chapter 15 has
advice on choosing a shard key for a variety of applications. Finally, Chapter 16 covers
administering a sharded cluster.

Application Administration
The next three chapters cover many aspects of MongoDB administration from the perspective
of your application. Chapter 17 discusses how to introspect what MongoDB is
doing. Chapter 18 covers administrative tasks such as building indexes, and moving
and compacting data. Chapter 19 explains how MongoDB stores data durably.

Server Administration
The final section is focused on server administration. Chapter 20 covers common options
when starting and stopping MongoDB. Chapter 21 discusses what to look for and
how to read stats when monitoring. Chapter 22 describes how to take and restore backups
for each type of deployment. Finally, Chapter 23 discusses a number of system
settings to keep in mind when deploying MongoDB.

Appendixes
Appendix A explains MongoDB’s versioning scheme and how to install it on Windows,
OS X, and Linux. Appendix B details how MongoDB works internally: its storage engine,
data format, and wire protocol.

Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, collection names, database names,
filenames, and file extensions.




Constant width
Used for program listings, as well as within paragraphs to refer to program elements
such as variable or function names, command-line utilities, environment variables,
statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined
by context.
This icon signifies a tip, suggestion, or general note.

This icon indicates a warning or caution.

Using Code Examples
This book can help you get your job done. In general, you may use the code in this book
in your programs and documentation. You do not need to contact us for permission
unless you’re reproducing a significant portion of the code. For example, writing a program
that uses several chunks of code from this book does not require permission.
Selling or distributing a CD-ROM of examples from O’Reilly books does require permission.
Answering a question by citing this book and quoting example code does not
require permission. Incorporating a significant amount of example code from this book
into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the
title, author, publisher, and ISBN. For example: “MongoDB: The Definitive Guide, Second
Edition by Kristina Chodorow (O’Reilly). Copyright 2013 Kristina Chodorow,
978-1-449-34468-9.”
If you feel your use of code examples falls outside fair use or the permission given here,
feel free to contact us at



Safari® Books Online
Safari Books Online (www.safaribooksonline.com) is an on-demand
digital library that delivers expert content in both book and video
form from the world’s leading authors in technology and business.
Technology professionals, software developers, web designers, and business and creative
professionals use Safari Books Online as their primary resource for research, problem
solving, learning, and certification training.
Safari Books Online offers a range of product mixes and pricing programs for organizations,
government agencies, and individuals. Subscribers have access to thousands of
books, training videos, and prepublication manuscripts in one fully searchable database
from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional,
Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John
Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT
Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology,
and dozens more. For more information about Safari Books Online, please visit us
online.

How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional
information. You can access this page at:
To comment or ask technical questions about this book, send email to:

For more information about our books, conferences, Resource Centers, and the O’Reilly
Network, see our website at:




Acknowledgments
I would like to thank my tech reviewers, Adam Comerford, Eric Milke, and Greg Studer.
You guys made this book immeasurably better (and more correct). Thank you, Ann
Spencer, for being such a terrific editor and for helping me every step of the way. Thanks
to all of my coworkers at 10gen for sharing your knowledge and advice on MongoDB,
as well as Eliot Horowitz and Dwight Merriman for starting the MongoDB project. And
thank you, Andrew, for all of your support and suggestions.




PART I

Introduction to MongoDB



CHAPTER 1

Introduction

MongoDB is a powerful, flexible, and scalable general-purpose database. It combines
the ability to scale out with features such as secondary indexes, range queries, sorting,
aggregations, and geospatial indexes. This chapter covers the major design decisions
that made MongoDB what it is.

Ease of Use
MongoDB is a document-oriented database, not a relational one. The primary reason
for moving away from the relational model is to make scaling out easier, but there are
some other advantages as well.
A document-oriented database replaces the concept of a “row” with a more flexible
model, the “document.” By allowing embedded documents and arrays, the document-oriented approach makes it possible to represent complex hierarchical relationships
with a single record. This fits naturally into the way developers in modern object-oriented languages think about their data.
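As a rough illustration of the idea above, the sketch below models a blog post as a single document. It uses a plain JavaScript object rather than a live MongoDB connection, and the field names (`title`, `author`, `tags`, `comments`) are invented for this example.

```javascript
// A hypothetical blog post represented as one document.
// In a relational design, the author, tags, and comments would
// typically live in separate tables joined by keys.
const post = {
  title: "Hello, documents",
  author: { name: "Ada", email: "ada@example.com" }, // embedded document
  tags: ["intro", "modeling"],                        // array of values
  comments: [                                         // array of embedded documents
    { by: "Grace", text: "Nice overview." },
    { by: "Linus", text: "What about indexes?" }
  ]
};

// The whole hierarchy travels as a single record, so related data
// can be read with ordinary property access instead of joins.
const commenters = post.comments.map((c) => c.by);
console.log(post.author.name, post.tags.length, commenters);
```

The nesting mirrors how the data would likely be held in application code, which is the "natural fit" the paragraph describes.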
There are also no predefined schemas: a document’s keys and values are not of fixed
types or sizes. Without a fixed schema, adding or removing fields as needed becomes
easier. Generally, this makes development faster as developers can quickly iterate. It is
also easier to experiment. Developers can try dozens of models for the data and then
choose the best one to pursue.
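To make the point about missing schemas concrete, here is a minimal sketch in which an in-memory array stands in for a collection. The array, the `people` name, and the fields are all assumptions made for illustration; a real collection is managed by the database, not by application code like this.

```javascript
// An in-memory stand-in for a collection (hypothetical, not a MongoDB API).
const people = [];

// Documents in the same collection need not share the same keys.
people.push({ name: "Ada", age: 36 });
people.push({ name: "Grace", languages: ["COBOL", "FORTRAN"] });

// The type of a key's value can differ from one document to the next.
people.push({ name: "Linus", age: "unknown" });

// Adding a field later touches only the documents that need it;
// no table-wide schema change is involved.
people[0].email = "ada@example.com";

// Removing a field is just as local.
delete people[2].age;

// Collect the sorted key names of each document to show how they differ.
const keysPerDocument = people.map((doc) => Object.keys(doc).sort());
console.log(keysPerDocument);
```

Each document ends up with its own set of keys, which is what makes it cheap to try several candidate data models side by side.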

Easy Scaling
Data set sizes for applications are growing at an incredible pace. Increases in available
bandwidth and cheap storage have created an environment where even small-scale applications
need to store more data than many databases were meant to handle. A terabyte
of data, once an unheard-of amount of information, is now commonplace.