Python Cookbook



Other resources from O’Reilly

Related titles

Python in a Nutshell
Python Pocket Reference
Learning Python
Programming Python
Python Standard Library

oreilly.com

oreilly.com is more than a complete catalog of O’Reilly books.
You’ll also find links to news, events, articles, weblogs, sample
chapters, and code examples.
oreillynet.com is the essential portal for developers interested in
open and emerging technologies, including new platforms, programming languages, and operating systems.

Conferences

O’Reilly brings diverse innovators together to nurture the ideas
that spark revolutionary industries. We specialize in documenting the latest tools and systems, translating the innovator’s
knowledge into useful skills for those in the trenches. Visit conferences.oreilly.com for our upcoming events.

Safari Bookshelf (safari.oreilly.com) is the premier online reference library for programmers and IT professionals. Conduct
searches across more than 1,000 books. Subscribers can zero in
on answers to time-critical questions in a matter of seconds.
Read the books on your Bookshelf from cover to cover or simply flip to the page you need. Try it today with a free trial.


SECOND EDITION

Python Cookbook

Edited by Alex Martelli,
Anna Martelli Ravenscroft, and David Ascher

Beijing • Cambridge • Farnham • Köln • Paris • Sebastopol • Taipei • Tokyo




Python Cookbook™, Second Edition
Edited by Alex Martelli, Anna Martelli Ravenscroft, and David Ascher
Compilation copyright © 2005, 2002 O’Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Copyright of original recipes is retained by the individual authors.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions
are also available for most titles (safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or

Editor: Jonathan Gennick
Production Editor: Darren Kelly
Cover Designer: Emma Colby
Interior Designer: David Futato
Production Services: Nancy Crumpton

Printing History:
July 2002: First Edition.
March 2005: Second Edition.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of
O’Reilly Media, Inc. The Cookbook series designations, Python Cookbook, the image of a springhaas,
and related trade dress are trademarks of O’Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a
trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and authors
assume no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.

This book uses RepKover™, a durable and flexible lay-flat binding.
ISBN-10: 0-596-00797-3
ISBN-13: 978-0-596-00797-3
[M]

[11/07]


Table of Contents

Preface

1. Text
1.1 Processing a String One Character at a Time
1.2 Converting Between Characters and Numeric Codes
1.3 Testing Whether an Object Is String-like
1.4 Aligning Strings
1.5 Trimming Space from the Ends of a String
1.6 Combining Strings
1.7 Reversing a String by Words or Characters
1.8 Checking Whether a String Contains a Set of Characters
1.9 Simplifying Usage of Strings’ translate Method
1.10 Filtering a String for a Set of Characters
1.11 Checking Whether a String Is Text or Binary
1.12 Controlling Case
1.13 Accessing Substrings
1.14 Changing the Indentation of a Multiline String
1.15 Expanding and Compressing Tabs
1.16 Interpolating Variables in a String
1.17 Interpolating Variables in a String in Python 2.4
1.18 Replacing Multiple Patterns in a Single Pass
1.19 Checking a String for Any of Multiple Endings
1.20 Handling International Text with Unicode
1.21 Converting Between Unicode and Plain Strings
1.22 Printing Unicode Characters to Standard Output
1.23 Encoding Unicode Data for XML and HTML
1.24 Making Some Strings Case-Insensitive
1.25 Converting HTML Documents to Text on a Unix Terminal


2. Files
2.1 Reading from a File
2.2 Writing to a File
2.3 Searching and Replacing Text in a File
2.4 Reading a Specific Line from a File
2.5 Counting Lines in a File
2.6 Processing Every Word in a File
2.7 Using Random-Access Input/Output
2.8 Updating a Random-Access File
2.9 Reading Data from zip Files
2.10 Handling a zip File Inside a String
2.11 Archiving a Tree of Files into a Compressed tar File
2.12 Sending Binary Data to Standard Output Under Windows
2.13 Using a C++-like iostream Syntax
2.14 Rewinding an Input File to the Beginning
2.15 Adapting a File-like Object to a True File Object
2.16 Walking Directory Trees
2.17 Swapping One File Extension for Another Throughout a Directory Tree
2.18 Finding a File Given a Search Path
2.19 Finding Files Given a Search Path and a Pattern
2.20 Finding a File on the Python Search Path
2.21 Dynamically Changing the Python Search Path
2.22 Computing the Relative Path from One Directory to Another
2.23 Reading an Unbuffered Character in a Cross-Platform Way
2.24 Counting Pages of PDF Documents on Mac OS X
2.25 Changing File Attributes on Windows
2.26 Extracting Text from OpenOffice.org Documents
2.27 Extracting Text from Microsoft Word Documents
2.28 File Locking Using a Cross-Platform API
2.29 Versioning Filenames
2.30 Calculating CRC-64 Cyclic Redundancy Checks

3. Time and Money
3.1 Calculating Yesterday and Tomorrow
3.2 Finding Last Friday
3.3 Calculating Time Periods in a Date Range
3.4 Summing Durations of Songs
3.5 Calculating the Number of Weekdays Between Two Dates
3.6 Looking up Holidays Automatically
3.7 Fuzzy Parsing of Dates
3.8 Checking Whether Daylight Saving Time Is Currently in Effect
3.9 Converting Time Zones
3.10 Running a Command Repeatedly
3.11 Scheduling Commands
3.12 Doing Decimal Arithmetic
3.13 Formatting Decimals as Currency
3.14 Using Python as a Simple Adding Machine
3.15 Checking a Credit Card Checksum
3.16 Watching Foreign Exchange Rates

4. Python Shortcuts
4.1 Copying an Object
4.2 Constructing Lists with List Comprehensions
4.3 Returning an Element of a List If It Exists
4.4 Looping over Items and Their Indices in a Sequence
4.5 Creating Lists of Lists Without Sharing References
4.6 Flattening a Nested Sequence
4.7 Removing or Reordering Columns in a List of Rows
4.8 Transposing Two-Dimensional Arrays
4.9 Getting a Value from a Dictionary
4.10 Adding an Entry to a Dictionary
4.11 Building a Dictionary Without Excessive Quoting
4.12 Building a Dict from a List of Alternating Keys and Values
4.13 Extracting a Subset of a Dictionary
4.14 Inverting a Dictionary
4.15 Associating Multiple Values with Each Key in a Dictionary
4.16 Using a Dictionary to Dispatch Methods or Functions
4.17 Finding Unions and Intersections of Dictionaries
4.18 Collecting a Bunch of Named Items
4.19 Assigning and Testing with One Statement
4.20 Using printf in Python
4.21 Randomly Picking Items with Given Probabilities
4.22 Handling Exceptions Within an Expression
4.23 Ensuring a Name Is Defined in a Given Module



5. Searching and Sorting
5.1 Sorting a Dictionary
5.2 Sorting a List of Strings Case-Insensitively
5.3 Sorting a List of Objects by an Attribute of the Objects
5.4 Sorting Keys or Indices Based on the Corresponding Values
5.5 Sorting Strings with Embedded Numbers
5.6 Processing All of a List’s Items in Random Order
5.7 Keeping a Sequence Ordered as Items Are Added
5.8 Getting the First Few Smallest Items of a Sequence
5.9 Looking for Items in a Sorted Sequence
5.10 Selecting the nth Smallest Element of a Sequence
5.11 Showing off quicksort in Three Lines
5.12 Performing Frequent Membership Tests on a Sequence
5.13 Finding Subsequences
5.14 Enriching the Dictionary Type with Ratings Functionality
5.15 Sorting Names and Separating Them by Initials

6. Object-Oriented Programming
6.1 Converting Among Temperature Scales
6.2 Defining Constants
6.3 Restricting Attribute Setting
6.4 Chaining Dictionary Lookups
6.5 Delegating Automatically as an Alternative to Inheritance
6.6 Delegating Special Methods in Proxies
6.7 Implementing Tuples with Named Items
6.8 Avoiding Boilerplate Accessors for Properties
6.9 Making a Fast Copy of an Object
6.10 Keeping References to Bound Methods Without Inhibiting Garbage Collection
6.11 Implementing a Ring Buffer
6.12 Checking an Instance for Any State Changes
6.13 Checking Whether an Object Has Necessary Attributes
6.14 Implementing the State Design Pattern
6.15 Implementing the “Singleton” Design Pattern
6.16 Avoiding the “Singleton” Design Pattern with the Borg Idiom
6.17 Implementing the Null Object Design Pattern
6.18 Automatically Initializing Instance Variables from __init__ Arguments
6.19 Calling a Superclass __init__ Method If It Exists
6.20 Using Cooperative Supercalls Concisely and Safely


7. Persistence and Databases
7.1 Serializing Data Using the marshal Module
7.2 Serializing Data Using the pickle and cPickle Modules
7.3 Using Compression with Pickling
7.4 Using the cPickle Module on Classes and Instances
7.5 Holding Bound Methods in a Picklable Way
7.6 Pickling Code Objects
7.7 Mutating Objects with shelve
7.8 Using the Berkeley DB Database
7.9 Accessing a MySQL Database
7.10 Storing a BLOB in a MySQL Database
7.11 Storing a BLOB in a PostgreSQL Database
7.12 Storing a BLOB in a SQLite Database
7.13 Generating a Dictionary Mapping Field Names to Column Numbers
7.14 Using dtuple for Flexible Access to Query Results
7.15 Pretty-Printing the Contents of Database Cursors
7.16 Using a Single Parameter-Passing Style Across Various DB API Modules
7.17 Using Microsoft Jet via ADO
7.18 Accessing a JDBC Database from a Jython Servlet
7.19 Using ODBC to Get Excel Data with Jython

8. Debugging and Testing
8.1 Disabling Execution of Some Conditionals and Loops
8.2 Measuring Memory Usage on Linux
8.3 Debugging the Garbage-Collection Process
8.4 Trapping and Recording Exceptions
8.5 Tracing Expressions and Comments in Debug Mode
8.6 Getting More Information from Tracebacks
8.7 Starting the Debugger Automatically After an Uncaught Exception
8.8 Running Unit Tests Most Simply
8.9 Running Unit Tests Automatically
8.10 Using doctest with unittest in Python 2.4
8.11 Checking Values Against Intervals in Unit Testing


9. Processes, Threads, and Synchronization
9.1 Synchronizing All Methods in an Object
9.2 Terminating a Thread
9.3 Using a Queue.Queue as a Priority Queue
9.4 Working with a Thread Pool
9.5 Executing a Function in Parallel on Multiple Argument Sets
9.6 Coordinating Threads by Simple Message Passing
9.7 Storing Per-Thread Information
9.8 Multitasking Cooperatively Without Threads
9.9 Determining Whether Another Instance of a Script Is Already Running in Windows
9.10 Processing Windows Messages Using MsgWaitForMultipleObjects
9.11 Driving an External Process with popen
9.12 Capturing the Output and Error Streams from a Unix Shell Command
9.13 Forking a Daemon Process on Unix


10. System Administration
10.1 Generating Random Passwords
10.2 Generating Easily Remembered Somewhat-Random Passwords
10.3 Authenticating Users by Means of a POP Server
10.4 Calculating Apache Hits per IP Address
10.5 Calculating the Rate of Client Cache Hits on Apache
10.6 Spawning an Editor from a Script
10.7 Backing Up Files
10.8 Selectively Copying a Mailbox File
10.9 Building a Whitelist of Email Addresses From a Mailbox
10.10 Blocking Duplicate Mails
10.11 Checking Your Windows Sound System
10.12 Registering or Unregistering a DLL on Windows
10.13 Checking and Modifying the Set of Tasks Windows Automatically Runs at Login
10.14 Creating a Share on Windows
10.15 Connecting to an Already Running Instance of Internet Explorer
10.16 Reading Microsoft Outlook Contacts
10.17 Gathering Detailed System Information on Mac OS X


11. User Interfaces
11.1 Showing a Progress Indicator on a Text Console
11.2 Avoiding lambda in Writing Callback Functions
11.3 Using Default Values and Bounds with tkSimpleDialog Functions
11.4 Adding Drag and Drop Reordering to a Tkinter Listbox
11.5 Entering Accented Characters in Tkinter Widgets
11.6 Embedding Inline GIFs Using Tkinter
11.7 Converting Among Image Formats
11.8 Implementing a Stopwatch in Tkinter
11.9 Combining GUIs and Asynchronous I/O with Threads
11.10 Using IDLE’s Tree Widget in Tkinter
11.11 Supporting Multiple Values per Row in a Tkinter Listbox
11.12 Copying Geometry Methods and Options Between Tkinter Widgets
11.13 Implementing a Tabbed Notebook for Tkinter
11.14 Using a wxPython Notebook with Panels
11.15 Implementing an ImageJ Plug-in in Jython
11.16 Viewing an Image from a URL with Swing and Jython
11.17 Getting User Input on Mac OS
11.18 Building a Python Cocoa GUI Programmatically
11.19 Implementing Fade-in Windows with IronPython

12. Processing XML
12.1 Checking XML Well-Formedness
12.2 Counting Tags in a Document
12.3 Extracting Text from an XML Document
12.4 Autodetecting XML Encoding
12.5 Converting an XML Document into a Tree of Python Objects
12.6 Removing Whitespace-only Text Nodes from an XML DOM Node’s Subtree
12.7 Parsing Microsoft Excel’s XML
12.8 Validating XML Documents
12.9 Filtering Elements and Attributes Belonging to a Given Namespace
12.10 Merging Continuous Text Events with a SAX Filter
12.11 Using MSHTML to Parse XML or HTML

13. Network Programming
13.1 Passing Messages with Socket Datagrams
13.2 Grabbing a Document from the Web
13.3 Filtering a List of FTP Sites
13.4 Getting Time from a Server via the SNTP Protocol
13.5 Sending HTML Mail
13.6 Bundling Files in a MIME Message
13.7 Unpacking a Multipart MIME Message
13.8 Removing Attachments from an Email Message
13.9 Fixing Messages Parsed by Python 2.4 email.FeedParser
13.10 Inspecting a POP3 Mailbox Interactively
13.11 Detecting Inactive Computers
13.12 Monitoring a Network with HTTP
13.13 Forwarding and Redirecting Network Ports
13.14 Tunneling SSL Through a Proxy
13.15 Implementing the Dynamic IP Protocol
13.16 Connecting to IRC and Logging Messages to Disk
13.17 Accessing LDAP Servers

14. Web Programming
14.1 Testing Whether CGI Is Working
14.2 Handling URLs Within a CGI Script
14.3 Uploading Files with CGI
14.4 Checking for a Web Page’s Existence
14.5 Checking Content Type via HTTP
14.6 Resuming the HTTP Download of a File
14.7 Handling Cookies While Fetching Web Pages
14.8 Authenticating with a Proxy for HTTPS Navigation
14.9 Running a Servlet with Jython
14.10 Finding an Internet Explorer Cookie
14.11 Generating OPML Files
14.12 Aggregating RSS Feeds
14.13 Turning Data into Web Pages Through Templates
14.14 Rendering Arbitrary Objects with Nevow

15. Distributed Programming
15.1 Making an XML-RPC Method Call
15.2 Serving XML-RPC Requests
15.3 Using XML-RPC with Medusa
15.4 Enabling an XML-RPC Server to Be Terminated Remotely
15.5 Implementing SimpleXMLRPCServer Niceties
15.6 Giving an XML-RPC Server a wxPython GUI
15.7 Using Twisted Perspective Broker
15.8 Implementing a CORBA Server and Client
15.9 Performing Remote Logins Using telnetlib
15.10 Performing Remote Logins with SSH
15.11 Authenticating an SSL Client over HTTPS

16. Programs About Programs
16.1 Verifying Whether a String Represents a Valid Number
16.2 Importing a Dynamically Generated Module
16.3 Importing from a Module Whose Name Is Determined at Runtime
16.4 Associating Parameters with a Function (Currying)
16.5 Composing Functions
16.6 Colorizing Python Source Using the Built-in Tokenizer
16.7 Merging and Splitting Tokens
16.8 Checking Whether a String Has Balanced Parentheses
16.9 Simulating Enumerations in Python
16.10 Referring to a List Comprehension While Building It
16.11 Automating the py2exe Compilation of Scripts into Windows Executables
16.12 Binding Main Script and Modules into One Executable on Unix

17. Extending and Embedding
17.1 Implementing a Simple Extension Type
17.2 Implementing a Simple Extension Type with Pyrex
17.3 Exposing a C++ Library to Python
17.4 Calling Functions from a Windows DLL
17.5 Using SWIG-Generated Modules in a Multithreaded Environment
17.6 Translating a Python Sequence into a C Array with the PySequence_Fast Protocol
17.7 Accessing a Python Sequence Item-by-Item with the Iterator Protocol
17.8 Returning None from a Python-Callable C Function
17.9 Debugging Dynamically Loaded C Extensions with gdb
17.10 Debugging Memory Problems

18. Algorithms
18.1 Removing Duplicates from a Sequence
18.2 Removing Duplicates from a Sequence While Maintaining Sequence Order
18.3 Generating Random Samples with Replacement
18.4 Generating Random Samples Without Replacement
18.5 Memoizing (Caching) the Return Values of Functions
18.6 Implementing a FIFO Container
18.7 Caching Objects with a FIFO Pruning Strategy
18.8 Implementing a Bag (Multiset) Collection Type
18.9 Simulating the Ternary Operator in Python
18.10 Computing Prime Numbers
18.11 Formatting Integers as Binary Strings
18.12 Formatting Integers as Strings in Arbitrary Bases
18.13 Converting Numbers to Rationals via Farey Fractions
18.14 Doing Arithmetic with Error Propagation
18.15 Summing Numbers with Maximal Accuracy
18.16 Simulating Floating Point
18.17 Computing the Convex Hulls and Diameters of 2D Point Sets

19. Iterators and Generators
19.1 Writing a range-like Function with Float Increments
19.2 Building a List from Any Iterable
19.3 Generating the Fibonacci Sequence
19.4 Unpacking a Few Items in a Multiple Assignment
19.5 Automatically Unpacking the Needed Number of Items
19.6 Dividing an Iterable into n Slices of Stride n
19.7 Looping on a Sequence by Overlapping Windows
19.8 Looping Through Multiple Iterables in Parallel
19.9 Looping Through the Cross-Product of Multiple Iterables
19.10 Reading a Text File by Paragraphs
19.11 Reading Lines with Continuation Characters
19.12 Iterating on a Stream of Data Blocks as a Stream of Lines
19.13 Fetching Large Record Sets from a Database with a Generator
19.14 Merging Sorted Sequences
19.15 Generating Permutations, Combinations, and Selections
19.16 Generating the Partitions of an Integer
19.17 Duplicating an Iterator
19.18 Looking Ahead into an Iterator
19.19 Simplifying Queue-Consumer Threads
19.20 Running an Iterator in Another Thread
19.21 Computing a Summary Report with itertools.groupby


20. Descriptors, Decorators, and Metaclasses
20.1 Getting Fresh Default Values at Each Function Call
20.2 Coding Properties as Nested Functions
20.3 Aliasing Attribute Values
20.4 Caching Attribute Values
20.5 Using One Method as Accessor for Multiple Attributes
20.6 Adding Functionality to a Class by Wrapping a Method
20.7 Adding Functionality to a Class by Enriching All Methods
20.8 Adding a Method to a Class Instance at Runtime
20.9 Checking Whether Interfaces Are Implemented
20.10 Using __new__ and __init__ Appropriately in Custom Metaclasses
20.11 Allowing Chaining of Mutating List Methods
20.12 Using Cooperative Supercalls with Terser Syntax
20.13 Initializing Instance Attributes Without Using __init__
20.14 Automatic Initialization of Instance Attributes
20.15 Upgrading Class Instances Automatically on reload
20.16 Binding Constants at Compile Time
20.17 Solving Metaclass Conflicts

Index



Preface

This book is not a typical O’Reilly book, written as a cohesive manuscript by one or
two authors. Instead, it is a new kind of book—a bold attempt at applying some
principles of open source development to book authoring. Over 300 members of the
Python community contributed materials to this book. In this Preface, we, the editors, want to give you, the reader, some background regarding how this book came
about and the processes and people involved, and some thoughts about the implications of this new form.

The Design of the Book
In early 2000, Frank Willison, then Editor-in-Chief of O’Reilly & Associates, contacted me (David Ascher) to find out if I wanted to write a book. Frank had been the
editor for Learning Python, which I cowrote with Mark Lutz. Since I had just taken a
job at what was then considered a Perl shop (ActiveState), I didn’t have the bandwidth necessary to write another book, and plans for the project were gently shelved.
Periodically, however, Frank would send me an email or chat with me at a conference regarding some of the book topics we had discussed. One of Frank’s ideas was
to create a Python Cookbook, based on the concept first used by Tom Christiansen
and Nathan Torkington with the Perl Cookbook. Frank wanted to replicate the success of the Perl Cookbook, but he wanted a broader set of people to provide input.
He thought that, much as in a real cookbook, a larger set of authors would provide
for a greater range of tastes. The quality, in his vision, would be ensured by the oversight of a technical editor, combined with O’Reilly’s editorial review process.
Frank and Dick Hardt, ActiveState’s CEO, realized that Frank’s goal could be combined with ActiveState’s goal of creating a community site for open source programmers, called the ActiveState Programmer’s Network (ASPN). ActiveState had a
popular web site, with the infrastructure required to host a wide variety of content,
but it wasn’t in the business of creating original content. ActiveState always felt that
the open source communities were the best sources of accurate and up-to-date content, even if sometimes that content was hard to find.

This is the Title of the Book, eMatter Edition
Copyright © 2007 O’Reilly & Associates, Inc. All rights reserved.

The O’Reilly and ActiveState teams quickly realized that the two goals were aligned
and that a joint venture would be the best way to achieve the following key objectives:
• Creating an online repository of Python recipes by Python programmers for
Python programmers
• Publishing a book containing the best of those recipes, accompanied by overviews and background material written by key Python figures
• Learning what it would take to create a book with a different authoring model
At the same time, two other activities were happening. First, those of us at
ActiveState, including Paul Prescod, were actively looking for “stars” to join
ActiveState’s development team. One of the candidates being recruited was the
famous (but unknown to us, at the time) Alex Martelli. Alex was famous because of
his numerous and exhaustive postings on the Python mailing list, where he exhibited an unending patience for explaining Python’s subtleties and joys to the increasing audience of Python programmers. He was unknown because he lived in Italy
and, since he was a relative newcomer to the Python community, none of the old
Python hands had ever met him—their paths had not happened to cross back in the
1980s when Alex lived in the United States, working for IBM Research and enthusiastically using and promoting other high-level languages (at the time, mostly IBM’s
Rexx).
ActiveState wooed Alex, trying to convince him to move to Vancouver. We came
quite close, but his employer put some golden handcuffs on him, and somehow Vancouver’s weather couldn’t compete with Italy’s. Alex stayed in Italy, much to my disappointment. As it happened, Alex was also at that time negotiating with O’Reilly
about writing a book. Alex wanted to write a cookbook, but O’Reilly explained that
the cookbook was already signed. Later, Alex and O’Reilly signed a contract for
Python in a Nutshell.
The second ongoing activity was the creation of the Python Software Foundation.
For a variety of reasons, best left to discussion over beers at a conference, everyone in
the Python community wanted to create a non-profit organization that would be the
holder of Python’s intellectual property, to ensure that Python would be on a legally
strong footing. However, such an organization needed both financial support and
buy-in from the Python community to be successful.
Given all these parameters, the various parties agreed to the following plan:
• ActiveState would build an online cookbook, a mechanism by which anyone
could submit a recipe (i.e., a snippet of Python code addressing a particular
problem, accompanied by a discussion of the recipe, much like a description of
why one should use cream of tartar when whipping egg whites). To foster a
community of authors and encourage peer review, the web site would also let
readers of the recipes suggest changes, ask questions, and so on.
• As part of my ActiveState job, I would edit and ensure the quality of the recipes.
Alex Martelli joined the project as a co-editor when the material was being prepared for publication, and, with Anna Martelli Ravenscroft, took over as primary editor for the second edition.
• O’Reilly would publish the best recipes as the Python Cookbook.
• In lieu of author royalties for the recipes, a portion of the proceeds from the
book sales would be donated to the Python Software Foundation.

The Implementation of the Book
The online cookbook (at was
the entry point for the recipes. Users got free accounts, filled in a form, and presto,
their recipes became part of the cookbook. Thousands of people read the recipes,
and some added comments, and so, in the publishing equivalent of peer review, the
recipes matured and grew. While it was predictable that the chance of getting your
name in print would get people attracted to the online cookbook, the ongoing success of the cookbook, with dozens of recipes added monthly and more and more references to it on the newsgroups, is a testament to the value it brings to the readers—
value which is provided by the recipe authors.
Starting from the materials available on the site, the implementation of the book was
mostly a question of selecting, merging, ordering, and editing the materials. A few
more details about this part of the work are in the “Organization” section of this
Preface.

Using the Code from This Book
This book is here to help you get your job done. In general, you may use the code in
this book in your programs and documentation. You do not need to contact us for
permission unless you’re reproducing a significant portion of the code. For example,
writing a program that uses several chunks of code from this book does not require
permission. Selling or distributing a CD-ROM of code taken from O’Reilly books
does require permission. Answering a question by citing this book and quoting
example code does not require permission. Incorporating a significant amount of
code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the
title, author, publisher, and ISBN. For example: “Python Cookbook, 2d ed., by Alex
Martelli, Anna Martelli Ravenscroft, and David Ascher (O’Reilly Media, 2005) 0-596-00797-3.” If you feel your use of code from this book falls outside fair use or the
permission given above, feel free to contact us at




Audience
We expect that you know at least some Python. This book does not attempt to teach
Python as a whole; rather, it presents some specific techniques and concepts (and
occasionally tricks) for dealing with particular tasks. If you are looking for an introduction to Python, consider some of the books described in the “Further Reading”
section of this Preface. However, you don’t need to know a lot of Python to find this
book helpful. Chapters include recipes demonstrating the best techniques for accomplishing some elementary and general tasks, as well as more complex or specialized
ones. We have also added sidebars, here and there, to clarify certain concepts which
are used in the book and which you may have heard of, but which might still be
unclear to you. However, this is definitely not a book just for beginners. The main
target audience is the whole Python community, mostly made up of pretty good programmers, neither newbies nor wizards. And if you do already know a lot about
Python, you may be in for a pleasant surprise! We’ve included recipes that explore
some of the newest and least well-known areas of Python. You might very well learn a
few things—we did! Regardless of where you fall along the spectrum of Python
expertise, and more generally of programming skill, we believe you will get something valuable from this book.
If you already own the first edition, you may be wondering whether you need this
second edition, too. We think the answer is “yes.” The first edition had 245 recipes;
we kept 146 of those (with lots of editing in almost all cases), and added 192 new
ones, for a total of 338 recipes in this second edition. So, over half of the recipes in
this edition are completely new, and all the recipes are updated to apply to today’s
Python—releases 2.3 and 2.4. Indeed, this update is the main factor which lets us
have almost 100 more recipes in a book of about the same size. The first edition covered all versions from 1.5.2 (and sometimes earlier) to 2.2; this one focuses firmly on
2.3 and 2.4. Thanks to the greater power of today’s Python, and, even more, thanks
to the fact that this edition avoids the “historical” treatises about how you had to do
things in Python versions released 5 or more years ago, we were able to provide substantially more currently relevant recipes and information in roughly the same
amount of space.

Organization
This book has 20 chapters. Each chapter is devoted to a particular kind of recipe,
such as algorithms, text processing, databases, and so on. The first edition had 17
chapters. There have been improvements to Python, both language and library, and
to the corpus of recipes the Python community has posted to the cookbook site, that
convinced us to add three entirely new chapters: on the iterators and generators
introduced in Python 2.3; on Python’s support for time and money operations, both
old and new; and on new, advanced tools introduced in Python 2.2 and following
releases (custom descriptors, decorators, metaclasses). Each chapter contains an
introduction, written by an expert in the field, followed by recipes selected from the
online cookbook (in some cases—about 5% of this book’s recipes—a few new recipes were specially written for this volume) and edited to fit the book’s formatting
and style requirements. Alex (with some help from Anna) did the vast majority of the
selection—determining which recipes from the first edition to keep and update, and
selecting new recipes to add, or merge with others, from the nearly 1,000 available
on the site (so, if a recipe you posted to the cookbook site didn’t get into this printed
edition, it’s his fault!). He also decided which subjects just had to be covered and
thus might need specially written recipes—although he couldn’t manage to get quite
all of the specially written recipes he wanted, so anything that’s missing, and wasn’t
on the cookbook site, might not be entirely his fault.
Once the selection was complete, the work turned to editing the recipes, and to
merging multiple recipes, as well as incorporating important contents from many significant comments posted about the recipes. This proved to be quite a challenge, just
as it had been for the first edition, but even more so. The recipes varied widely in
their organization, level of completeness, and sophistication. With over 300 authors
involved, over 300 different “voices” were included in the text. We have striven to
maintain a variety of styles to reflect the true nature of this book, the book written by
the entire Python community. However, we edited each recipe, sometimes quite considerably, to make it as accessible and useful as possible, ensuring enough uniformity in structure and presentation to maximize the usability of the book as a whole.
Most recipes, both from the first edition and from the online site, had to be updated,
sometimes heavily, to take advantage of new tools and better approaches developed
since those recipes were originally posted. We also carefully reconsidered (and
slightly altered) the ordering of chapters, and the placement and ordering of recipes
within chapters; our goal in this reordering was to maximize the book’s usefulness
for both newcomers to Python and seasoned veterans, and, also, for both readers
tackling the book sequentially, cover to cover, and ones just dipping in, in “random
access” fashion, to look for help on some specific area.
While the book should thus definitely be accessible “by hops and jumps,” we nevertheless believe a first sequential skim will amply repay the modest time you, the
reader, invest in it. On such a skim, skip every recipe that you have trouble following or that is of no current interest to you. Despite the skipping, you’ll still get a
sense of how the whole book hangs together and of where certain subjects are covered, which will stand you in good stead both for later in-depth sequential reading, if
that’s your choice, and for “random access” reading. To further help you get a sense
of what’s where in the book, here’s a capsule summary of each chapter’s contents,
and equally capsule bios of the Python experts who were so kind as to take on the
task of writing the chapters’ “Introduction” sections.



Chapter 1, Text, introduction by Fred L. Drake, Jr.
This chapter contains recipes for manipulating text in a variety of ways, including combining, filtering, and formatting strings, substituting variables throughout a text document, and dealing with Unicode.
Fred Drake is a member of the PythonLabs group, working on Python development. A father of three, Fred is best known in the Python community for singlehandedly maintaining the official documentation. Fred is a co-author of Python
& XML (O’Reilly).
Chapter 2, Files, introduction by Mark Lutz
This chapter presents techniques for working with data in files and for manipulating files and directories within the filesystem, including specific file formats
and archive formats such as tar and zip.
Mark Lutz is well known to most Python users as the most prolific author of
Python books, including Programming Python and Python Pocket Reference, as well as
Learning Python, which he co-authored with David Ascher (all from O’Reilly).
Mark is also a leading Python trainer, spreading the Python gospel throughout
the world.
Chapter 3, Time and Money, introduction by Gustavo Niemeyer and Facundo Batista
This chapter (new in this edition) presents tools and techniques for working
with dates, times, decimal numbers, and some other money-related issues.
Gustavo Niemeyer is the author of the third-party dateutil module, as well as a
variety of other Python extensions and projects. Gustavo lives in Brazil. Facundo
Batista is the author of the Decimal PEP 327, and of the standard library module
decimal, which brought floating-point decimal support to Python 2.4. He lives in
Argentina. The editors were delighted to bring them together for this introduction.
Chapter 4, Python Shortcuts, introduction by David Ascher
This chapter includes recipes for many common techniques that can be used
anywhere, or that don’t really fit into any of the other, more specific recipe categories.
David Ascher is a co-editor of this volume. David’s background spans physics,
vision research, scientific visualization, computer graphics, a variety of programming languages, co-authoring Learning Python (O’Reilly), teaching Python, and
these days, a slew of technical and nontechnical tasks such as managing the
ActiveState team. David also gets roped into organizing Python conferences on a
regular basis.
Chapter 5, Searching and Sorting, introduction by Tim Peters
This chapter covers techniques for searching and sorting in Python. Many of the
recipes explore creative uses of the stable and fast list.sort in conjunction with
the decorate-sort-undecorate (DSU) idiom (newly built in with Python 2.4),
while others demonstrate the power of heapq, bisect, and other Python searching and sorting tools.
Tim Peters, also known as the tim-bot, is one of the mythological figures of the
Python world. He is the oracle, channeling Guido van Rossum when Guido is
busy, channeling the IEEE-754 floating-point committee when anyone asks anything remotely relevant, and appearing conservative while pushing for a constant evolution in the language. Tim is a member of the PythonLabs team.
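For readers who have not met the DSU idiom named in the Chapter 5 summary, here is a minimal sketch of the idea, set beside the built-in key= form and the heapq and bisect modules. It is not a recipe from the book: the sample data and every variable name are hypothetical, and the code is written for a current Python 3 interpreter rather than the 2.3 and 2.4 releases the book targets.

```python
import bisect
import heapq

# Hypothetical sample data (name, age); not taken from the book.
people = [("dana", 41), ("lee", 29), ("kim", 35), ("sam", 29)]

# Manual decorate-sort-undecorate: pair each item with its sort key and
# its original index, sort the pairs, then strip the decoration again.
# The index keeps ties in their original order and avoids comparing items.
decorated = [(age, i, (name, age)) for i, (name, age) in enumerate(people)]
decorated.sort()
by_age_dsu = [item for _, _, item in decorated]

# The built-in form: the key= argument performs the same decoration.
by_age_key = sorted(people, key=lambda person: person[1])

# heapq: pick the two youngest without sorting the whole list.
youngest_two = heapq.nsmallest(2, people, key=lambda person: person[1])

# bisect: locate a value's insertion point in an already sorted list.
ages = [age for _, age in by_age_key]
position = bisect.bisect_left(ages, 35)
```

Both sorting forms give the same stable ordering; the key= form is simply shorter and leaves the bookkeeping to the interpreter.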
Chapter 6, Object-Oriented Programming, introduction by Alex Martelli
This chapter offers a wide range of recipes that demonstrate the power of object-oriented programming with Python, including fundamental techniques such as
delegating and controlling attribute access via special methods, intermediate
ones such as the implementation of various design patterns, and some simple
but useful applications of advanced concepts, such as custom metaclasses, which
are covered in greater depth in Chapter 20.
Alex Martelli, also known as the martelli-bot, is a co-editor of this volume. After
almost a decade with IBM Research, then a bit more than that with think3, inc.,
Alex now works as a freelance consultant, most recently for AB Strakt, a Swedish Python-centered firm. He also edits and writes Python articles and books,
including Python in a Nutshell (O’Reilly) and, occasionally, research works on
the game of contract bridge.
Chapter 7, Persistence and Databases, introduction by Aaron Watters
This chapter presents Python techniques for persistence, including serialization
approaches and interaction with various databases.
Aaron Watters was one of the earliest advocates of Python and is an expert in
databases. He’s known for having been the lead author on the first book on
Python (Internet Programming with Python, M&T Books, now out of print), and
he has authored many widely used Python extensions, such as kjBuckets and
kwParsing. Aaron currently works as a freelance consultant.
Chapter 8, Debugging and Testing, introduction by Mark Hammond
This chapter includes a collection of recipes that assist with the debugging and
testing process, from customizing error logging and traceback information, to
unit testing with custom modules, unittest and doctest.
Mark Hammond is best known for his work supporting Python on the Windows platform. With Greg Stein, he built an incredible library of modules interfacing Python to a wide variety of APIs, libraries, and component models such as
COM. He is also an expert designer and builder of developer tools, most notably Pythonwin and Komodo. Finally, Mark is an expert at debugging even the
messiest systems: during Komodo development, for example, Mark was
often called upon to debug problems that spanned three languages (Python,
C++, JavaScript), multiple threads, and multiple processes. Mark is also co-author, with Andy Robinson, of Python Programming on Win32 (O’Reilly).
