Book Review: The Success of Open Source

For the past few weeks I’ve been wrapped up reading Steven Weber’s
The Success of Open Source. Published in 2004, it is a look at what
the open source movement is and how it works, from the perspective of a
political scientist. This is no trite look at why people would choose to
give away the fruits of their labor. His analysis is serious and well
considered. He stresses several times that his goal is to ask questions
rather than answer them, but he does offer some observations about the
open source movement as a larger social movement and how it might spread
to other parts of the culture.

Weber starts out by explaining his goal for the book, to study the
political and economic foundations of open source communities and
processes. He makes two assertions, around which the rest of the book is
framed:

1. The open source phenomenon is an important “puzzle” for social
scientists who study cooperation.
2. OSS communities have been fundamentally impacted by the internet.

Early History

The second chapter covers the basic facts of the early history of open
source, well before it was called that. From the PACT compiler project
for IBM mainframes, through the failure of Multics, and the unintended
consequence of the AT&T consent decree that led to the original
licensing terms for Unix, he covers some details that aren’t a part of
the usual story that includes DARPA, BSD, the fragmentation of the Unix
market, and FSF and the GNU project. The writing is engaging, and I
could recommend the book on this history section alone.

How Does OSS Work?

Chapter 3 tries to answer the question, “What is Open Source and How
Does It Work?”. It covers some essential software project
characteristics such as the division of labor between “analyst” and
“programmer” and how that historically led to problems because the
designer of software was too far removed from the end-user.

The essence of software design, like the writing of poetry, is a
creative process. The role of technology and organization is to
liberate that creativity to the greatest extent possible and to
facilitate its translation into working code. Neither new technology
nor a “better” division of labor can replace the creative essence
that drives the project.

Weber builds on Brooks’s Law to say that the success of a project isn’t
just about getting more people involved, but also about how they are
organized. He points out that open source is much more about the
process than the resulting product, which is an artifact of the
organization and creative energies of the participants. He identifies
four fundamental organization schemes that repeat in open source
projects:

1. A hierarchy, where patches flow up to a more or less central
maintainer, as with Linux.
2. The concentric circles used by the BSD project, in which
maintainers closer to the center have more rights and privileges, but
within a circle they are essentially equal.
3. The pumpkin holder or token-based system used by the developers
of Perl.
4. A democratic voting system, such as used to approve changes in
Apache.

One assertion Weber makes relates to the different cultures that
evolve around BSD vs. GPL-licensed projects. His claim is that core
developers in BSD-licensed projects do not depend as much on submissions
from the end user as GPL projects do. His evidence for this is the
various BSD operating systems and Linux. I think his sample size is too
small, though. I’m not convinced that the license has much to do with
“dependence” on contributions. I think the attitude of the core
developers, and their willingness to accept patches, is more important.

Evolution of Open Source

Chapter four talks about the “maturation” of three major projects
(Linux, BSD, and Apache) as they evolved in the 1990s, the “golden age”
of open source. He covers several pivotal events during that period and
how the community identities gelled as a result of passing through
critical times like the fracturing of BSD and other Unixes, flame wars
and other crises among the Linux maintainers, and the conflict caused by
the “ideological passion” of Richard Stallman and the FSF. This chapter
was an interesting retrospective and it really pulled together a
cohesive picture of what happened that brought us to where we are today.

Motivation and Organization

Chapter five examines the microfoundations of open source made up of
the motivations of individual contributors. For example, he says that
open source developers self-select as a way to boost their egos by using
acceptance of their code as a “signal” of its quality to developers who
are not necessarily skilled enough to recognize quality on their own.

It is clearly the best programmers who have the strongest incentive
to show others just how good they are. If you are mediocre, the last
thing you want is for people to see your source code.

Ego boosting is one of six motivating factors he discusses, and is not
necessarily the most important for most developers.

Chapter six looks at how individual developers come together to form
groups and focus their creative energies with constructive
contributions. He studies the social and economic pressures for and
against forking a project, and comes to an interesting conclusion: The
leader of a project needs the fellow contributors more than they need
him. When a fork is created, the new leader has to convince potential
followers that the new project will be better or more popular than the
old one. So while forking may give the leader more visibility, that
only works if he is successful at attracting followers, in which case he
is just as likely to be a successful contributor to the original
project.

The Code That Changed The World

Weber begins his final chapter by comparing the impact of OSS to the
Japanese manufacturing innovations described in The Machine That
Changed the World: The Story of Lean Production and re-emphasizing
the importance of process over product.

The Toyota “system” was not a car, and it was not uniquely Japanese.
… Open source is not a piece of software, and it is not unique to
a group of hackers.

This leads in to the rest of the conclusion, where he brings together
observations on intellectual property rights law, the limiting factors
for specialization and division of labor and how they impact
organizational structures, and the challenges of relating hierarchical
versus network organizations. He also offers some observations about how
open source techniques and attitudes can be applied directly in other
fields such as family practice medicine and genomics.

Recommendations

Weber covers a lot of material, and his writing is clear, for the most
part (especially for an academic :-). I enjoyed reading the first seven
chapters, but got a little bogged down in chapter eight. I was
disappointed at his reluctance to draw more definite conclusions in a
few cases, but by remaining neutral he was able to focus on framing
several thought-provoking social and economic questions about the open
source movement.

Book Review: Python Essential Reference, Fourth Edition

Disclosure: I received a copy of this book for free from
Addison-Wesley as part of the PyATL Book Club.

I have a copy of the first edition of the Python Essential
Reference that I picked up at IPC 8 back in 2000. It’s largely out of
date by now, given that it covered Python 1.5.2. But at the time it was
one of the few books I always kept close at hand for easy reference.
Over time my reference habits evolved away from paper references in
favor of online materials. Today I cleared a little space on my desk
for the fourth edition of PER by David Beazley, updated to cover Python
2.6 and 3.0.

Pound for pound

Just a little space, mind you, because the book is quite compact (717
pages in 6” x 9” x 1”, easily portable in a backpack or briefcase). This
book, diminutive though it may be, has more information of direct use to
Python programmers than many of the War and Peace-sized tomes you’ll
find elsewhere. If David keeps adding material at this rate, I’m going
to need a magnifying glass for the next edition.

The book is organized into three main sections: Language, Library, and
Extending and Embedding. There is a comprehensive index and the chapter
sequence places related information close together. You will not find
yourself flipping back and forth between an early “prose” chapter and a
later “reference” section.

Language

The language section can serve as a reference guide for Python, though
I think the first chapter title “Tutorial” is a little optimistic
based on the brevity. To be fair, the preface states right up front
that the book is not intended to be an introductory text.

This is not a book for learning Python. It is a book for writing
Python.

Library

The coverage of the standard library is where PER really shines. I
have a certain amount of interest in documenting the Python
standard library myself, so I was especially keen to review the
material here. I found it up to date, clearly explained, and
detailed. There is not a lot of sample code, but it is not entirely
devoid of examples. In most cases, the prose descriptions are
sufficient and eliminating code samples let David maintain a readable
style without adding filler material.

I thought I had internalized most of this material long ago, but I
learned a few things by re-reading it.

As the title implies, this is not an exhaustive reference guide. It
covers the essential information that will be useful to the most
readers. As a result, some of the modules are covered in less depth than
others. However, I tend to agree with where focus is placed. For
example, much more space is given to working with sqlite3 and databases
in general than some of the more esoteric modules like dis. The ast
module doesn’t appear at all.

Extending and Embedding

The Extending and Embedding section is one area where plenty of
example code is provided. Three techniques for creating extension
modules are covered: hand coding, ctypes, and SWIG (no surprise, since
SWIG is popular and was written by the author). Examples and commentary
are provided for all three approaches.

Going the other direction, embedding an interpreter in another
application, is also explained. All of the functions from the Python
library useful to someone trying to make their application scriptable
are listed and described, with some basic examples showing how to
communicate between the interpreter and your main application.

Recommendation

Due to the reference style, this should not be your first Python book.
It should absolutely be your second.

Book Review: The Practice of Programming

The Practice of Programming by Brian Kernighan and Rob Pike left me
a little disappointed. If I had read it at a time closer to when it was
originally published in 1999, I may have come away liking the book
better. There’s nothing wrong with the advice, and it reads well, but I
don’t think the examples are standing the test of time.

It is also possible that I’m just not a good representative of the
target audience. The book focuses a lot on C, C++, and Java, with most
of the examples in C. By 1999 I had moved over to mostly Python
development and wasn’t doing the sort of low-level coding in C that may
have made their examples more appealing.

This book is about the practice of programming – how to write
programs for real.

The sections that talk about the social aspects of programming
still offer valuable advice. For example, the code style issues and
recommendations in chapter 1 can apply to any language, and basically
boil down to “write simple code meant to be read by another human being.”
I agree with their premise that clear code is also frequently the most
optimal for runtime performance.

Chapter two talks about data structures and algorithms and covers a
simple implementation of quicksort. The explanation is good, but on
the other hand, does anyone still need to write their own sort
functions? Do we really not have any other algorithms that are complex
enough to be interesting, common enough to be worth reading about, and
understood well enough that we can analyze them? I found that
Beautiful Code (O’Reilly) had more interesting examples.
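For reference, the kind of routine the chapter walks through looks roughly like this in Python. This is my own minimal sketch, not the authors’ C code; the function name quicksort and the list-based partitioning are my choices for illustration, and for real work the built-in sorted() is almost always the right tool.

```python
import random


def quicksort(items):
    """Return a new sorted list (illustrative sketch, not the book's version)."""
    if len(items) <= 1:
        return list(items)
    # A random pivot avoids the worst case on already-sorted input.
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```

Building new lists at each step costs extra memory compared with an in-place partition, but it keeps the recursive structure of the algorithm easy to see.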

The Markov chain generator in chapter three did remind me of some code
I wrote in 1998 (in Python) to find phrases in a FrameMaker document
that might be worth adding to the index. Maybe I should recreate some of
that code for Sphinx.

I skimmed over a lot of the performance section since it seemed
focused on low level C code and I try to avoid the sorts of situations
where I would need to implement my own version of strstr() just to
create a spam filter. This was another example of the text being dated.

I think the only real disagreement I had with anything they said was
in the section on debugging and error handling, where the authors
suggest saving exceptions for “truly exceptional” situations and don’t
think failing to open a file qualifies. I see two problems with this
position: First, failing to raise an exception means there are two error
reporting channels that need to be handled separately. Second, low-level
code would have to check for all error conditions instead of allowing
the exception to bubble up. Moving all error reporting to exceptions
means that low-level code is more streamlined (and easier to read)
because it doesn’t have to consider error handling unless it can work
around the problem (creating a missing file, for example). Maybe the
advice is based on bad exception support in C++/Java? Or maybe their
selected example is just not good. Searching for a substring may have
been a better choice for an example, since the missing value isn’t
actually an error.
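To make the contrast concrete, here is a minimal Python sketch of the style I am arguing for. The function names read_config and load_settings are hypothetical, invented for this illustration; nothing here comes from the book.

```python
def read_config(path):
    """Low-level helper with no error handling of its own.

    If the file is missing, open() raises FileNotFoundError and the
    exception bubbles up to whichever caller is able to deal with it.
    """
    with open(path, encoding="utf-8") as f:
        return f.read()


def load_settings(path, default=""):
    """Higher-level caller that handles the error because it has a workaround."""
    try:
        return read_config(path)
    except FileNotFoundError:
        return default


# Substring search is different: "not found" is an ordinary answer,
# so str.find() reports it with a return value instead of an exception.
assert "spam and eggs".find("ham") == -1
```

The low-level function stays short because it only describes the successful path, and there is a single channel (exceptions) for reporting every failure.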

If you’re new to professional programming, this book might be useful.
If you have some real-world experience under your belt, you will find
the same advice elsewhere in a more modern form. This book does talk
about “how to write programs for real”, just not the sorts of programs I
have ended up writing.

Book Review: IronPython in Action

IronPython in Action by Michael Foord and Christian Muirhead
covers the version of Python built to run on Microsoft’s CLR and
explains how to use it with the .NET framework.

Disclaimer: I received a review copy of this book from Manning
through the PyATL Book Club.

There are two target audiences for this book: experienced Python
developers wanting to learn .NET, and experienced .NET developers
wanting to learn Python. Both groups will find plenty of interesting
material and learn a lot. After some relatively basic introductory
chapters, the authors dig right in building a complex GUI application,
and then implementing a web interface for the same desktop application.

Along the way they introduce topics such as different programming
models in Python, navigating the MSDN documentation (very important for
understanding the scope of features available in .NET), packaging your
app for distribution under Windows, data persistence, XML parsing,
design patterns, automated testing, system administration, relational
databases, and two separate GUI libraries.

All of the code is clear, concise, and useful – there are no fluffy,
throw-away code snippets that fall short in the real world. While many
of the examples given are specific to IronPython or .NET, the techniques
being illustrated are definitely not.

I recommend this book for any Windows developer interested in learning
about Python, and for Python developers looking into deploying an
application under Windows. If you don’t fall into either of those
groups, I can still recommend that you pick up a copy for some excellent
advice on general programming topics and the solid example code.

Book Review: Hello, World!

Hello, World! Computer Programming for Kids and Other Beginners by
Warren and Carter Sande is an introduction to programming in general
(and Python specifically) aimed at pre-teens or young teens.

Disclaimer: I received a review copy of this book from Manning
through the PyATL Book Club.

Although the book is designed for a young audience, it is not
condescending, as many kids’ books tend to be, so it remains readable by
adults who need a very basic text on how computer programs work. And
by “basic” I mean from the ground up. The book covers using an editor
to create and modify program files, numbers, strings, variables,
branching, and looping. It doesn’t stop with basic topics, though. By
the mid-point of the book, the authors have built up to the point
where introducing PyGame and graphics programming isn’t a stretch, and
by the end of the book they have covered the GUI, animation, and sound
techniques needed to create two simple computer games.

The writing style is clear and friendly without coming off as cutesy.
Each chapter is relatively short, with review questions at the end in
the style of a text book (the answer guide is available in the
appendix). There is a liberal use of sidebars to break up longer
sections or highlight related digressions. And the authors also don’t
shy away from showing “broken” versions of programs as they evolve,
which teaches the reader how to understand error messages and debug
problems – an extremely important skill for a programmer.

I recommend checking out Hello, World! if you have a young person
in your life who is interested in learning about programming. Writing
the book was a father/son project, and reading it together seems
like a fun parent/child activity for the summer.

Book Review: The Economics of Iterative Development

The Economics of Iterative Software Development, by Walker Royce,
Kurt Bittner, and Mike Perrow, covers techniques for achieving more
predictable results with development projects.

Disclaimer: I received a review copy of this book through the PyATL
Book Club.

The goal of the book is to encourage adoption of iterative
development processes, rather than the old-fashioned waterfall model
(does anyone really still use that model?). In the authors’ view,
iterative processes, where one builds a rough version of a product and
continually refines it over time, yield better results in terms of
predictable schedule and meeting real requirements. This isn’t a new
idea, and they make reference to agile processes such as XP and Scrum,
along with RUP. But the book isn’t necessarily a guide to a specific set
of processes. It’s framed as more of an argument in favor of the entire
class of techniques represented by modern development methodologies.

The authors clearly have experience with development processes and
evaluating and improving performance of teams. Royce is a VP from IBM’s
Rational Services group and a contributor to RUP. Bettner is a CTO at
Ivar Jacobsen Consulting and also contributed to RUP as well as
jazz.net. Perrow is a writer within the Rational group at IBM. Their
experience shows in the authoritative tone the book takes when
presenting best practices.

“day-to-day decisions are dominated by value judgements, cost
trade-offs, …”

They start from the premise that software development is less of an
engineering practice and more like a creative endeavor such as producing
a film. Only 1/3 of movies deliver on time and within budget, for
example. The comparison resonates with me since I have always viewed
development work as more creative than mechanical manufacturing or
engineering, although I was never quite sure other types of engineering
were as non-creative as they are portrayed.

The more I think about it though, the more I am inclined to see
managing software development as managing invention, which is even
farther from typical engineering than movie production. With software,
no two products or projects are exactly the same, so the “best
practices” we learn have to be adapted for every situation.

They go on to describe a generalized history of approaches to software
development processes. In the ’60s and ’70s the attitude was
“craftsmanship”, with lots of customization of tools and processes. In
the ’80s and ’90s the trend was towards an engineering approach, but
it still had a lot of innovation with new technology and techniques.
Recent techniques have paid more attention to risk, taking advantage of
automation.

Write code. Less of it. Mostly high-level languages.

The authors propose a couple of primary ways to reduce risk in development.
The first is using component-based and service-oriented
architectures. By isolating parts of the implementation from one another
and connecting them through established interfaces, you can iterate over
different parts improving as needed. There is also an emphasis on
reducing the amount of “human generated” source code, either through
high-level languages, off-the-shelf components, or code generation
(unsurprisingly, they specifically mention UML-based code generators).

This was about the point in the book where I realized that it was
missing the material needed to back up the assertions and claims being
made. With a name like “The Economics of Iterative Software
Development,” I expected to find more statistics and supporting research
material than is provided. I don’t disagree with any of the authors’
conclusions (indeed, they’re hardly new insights for anyone who has read
a couple of books on agile methodologies). The problem is that if I were
trying to use this book to convince someone who did not already accept
the premise, I wouldn’t have any basis for an argument.

This lack of background was particularly evident in chapter 7, where
they talk about ways to “accelerate cultural change” to iterative
development by choosing a high-profile project instead of easing into it
with a pilot project. Their rationale is that the people assigned to
work on high-profile projects are typically the better performers
already, and if you get their buy-in, they will make the change work
because of their dedication. The trick is getting the buy-in in the
first place, of course.

I found the book interesting, well written, and worth reading. It
doesn’t quite stand on its own, though, if you’re looking for ammunition
to change your boss’ mind about process. If you have already decided to
go with an iterative process, it will reinforce your decision and
provide guidance to make it work (particularly in the appendix). But it
didn’t live up to my expectations, based on the title.

Updated: Check out my notes for this book on readernaut.com.

Book Review: Expert Python Programming

Neha Shaikh at Packt Publishing sent me a copy of Tarek Ziadé’s new
book *Expert Python Programming* for review and I finally finished
it this weekend. Overall, I liked the book.

My first impression was, “Really, a chapter on installing Python in an
expert level book?” As it turns out, I’m glad that chapter was there
because I learned about setting a default module to be imported when the
interactive interpreter starts. I’m sure I’d heard of that feature at
some point, but I’ve never actually tried it until now.
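As far as I can tell, the feature in question is the PYTHONSTARTUP environment variable, which names a file that the interpreter runs before showing the first interactive prompt. Here is a minimal sketch of such a file. The file name ~/.pythonstartup.py and the pp helper are my own hypothetical choices, not something taken from the book.

```python
# ~/.pythonstartup.py  (hypothetical file name)
# Enable it from the shell with:  export PYTHONSTARTUP=~/.pythonstartup.py
import os
import sys
from pprint import pprint


def pp(obj):
    """Pretty-print obj; a hypothetical convenience helper for the prompt."""
    pprint(obj)


try:
    # readline is not available on every platform, so fail quietly.
    import readline
    import rlcompleter  # noqa: F401  (importing it installs the completer)
    readline.parse_and_bind("tab: complete")
except ImportError:
    pass
```

Anything defined in the file (here os, sys, pprint, and pp) is available at the interactive prompt. The file is not read when running a script.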

The book continues by covering a range of topics, alternating between
introducing new tools and promoting techniques to make your coding
better. The overall themes of the chapters progress from “make it
work” to “make it work right” and then “make it work faster” – just
as your development cycles should.

There were a few sections where I would have liked him to go deeper
into certain topics, but the author was clearly trying to introduce a
wide variety of topics and achieved that goal. There are plenty of
references to supplemental resources online, so it’s easy to keep
digging on your own. And, given the breadth of material covered,
there’s something here for everyone.

Although a few minor mistakes slipped through the editing process, Tarek
maintains an errata page online and is posting corrections there.

In summary: Recommended.

Updated 1 Dec: Neha sent me a link to a sample chapter so you can
check it out before buying.

Book Review: Einstein: His Life and Universe, by Walter Isaacson

Mrs. PyMOTW gave me a copy of Einstein: His Life and Universe by
Walter Isaacson for Christmas last year, and I’ve finally managed to
find time to read it. If you are interested in history, science, and
Einstein in particular, I highly recommend the book.

General Notes

It took a couple of weeks of reading in the evenings, but that was
mostly short-burst sessions; the prose flows very smoothly. Isaacson
is a good storyteller and has created an engaging view of Einstein
as a man and as a scientist.

The book is organized in a semi-chronological way, with some
overlapping sections focusing on different aspects of the same time
period. This allows Isaacson to tell all of the stories clearly, yet
stitch them together by referring back to earlier quotes and
events. The end result is a coherent narrative that exposes the
personal side of Einstein as much as his professional or public
sides. I found this writing style easy to follow and quite effective.

Politics

Einstein was more politically active than I realized; I learned about
his strong ethical nature, and especially his activism against war and
racism. His rejection of tribalism and nationalism, along with the
regimented militarism of Germany’s schools at the time, led him to
become a pacifist, and then eventually support World War II to fight
fascism. While he had some socialist beliefs, he also rejected the
communism practiced in the Soviet Union, since it oppressed the people
there. He said, “Any government is evil if it carries within it the
tendency to deteriorate into tyranny”. After he settled in Princeton,
he repeatedly said that he would not live in a country where people
lacked the freedom of speech and thought.

From an early age, Einstein supported the establishment of a strong
global government as a way to prevent war. After the development of
nuclear weapons, he felt even more strongly that a true transnational
governing body should have control over such destructive power.

Science

Of course no biography of Einstein would be complete without
descriptions of his major scientific contributions. It is clear that
Isaacson enjoyed researching the scientific side of his subject as
much as, or more than, his personal life. He uses many of Einstein’s
own thought experiments to describe the work in terms that are easy
for a non-physicist to understand. Although true understanding
requires complex mathematics, this book does not.

Book Review: RESTful Web Services

As part of the Atlanta Python Users’ Group Book Club, I received a
copy of RESTful Web Services written by Leonard Richardson and Sam
Ruby, and published by O’Reilly Media. When we started the book club,
this was the first book I suggested we read. I had previously studied
some material on various blogs discussing REST, but I wanted a clear
description and more specific examples. The book provided both, and I
highly recommend reading it before planning your next web development
project.

Overview

Unlike many such books, RESTful does not depend on a single
programming language for examples. Much of the code is Ruby, but Python
and Java make up a respectable proportion of the material as well. Since
I was primarily interested in the design principles and “theory”, I did
not try to run any of the sample code myself. Others in the book club
have, so check the forum for more details if you are interested in
that aspect of the book.

The outline of the book follows a well thought-out progression of
topics from basic “programmable web” concepts to in-depth discussion of
Roy Fielding’s Representational State Transfer (REST) ideas and then
Resource Oriented Architecture (ROA), a natural extension of REST.
Intermediate chapters include discussions of best-of-breed tools for web
development and copious example code.

Outline

Chapter 1 is a foundation chapter for the remainder of the book. It
describes how the HTTP protocol works and breaks down the different
architectural styles discussed in the remaining chapters (REST, RPC, and
REST-RPC hybrid). The theme of this chapter, and perhaps the entire
book, is that “the human web is on the programmable web”. If something
is on the web, it is a service.

Chapter 2 introduces the concepts necessary to implement clients using
web services. The easily digestible example code (in several languages)
implements a client for the del.icio.us bookmarking service.
Bookmarks are an excellent choice for an example program, since the
information being managed is straightforward and everyone understands
the concept, even if they have never used del.icio.us directly. Chapter
2 also includes recommendations for client tools and libraries for
common languages. Basic HTTP access, JSON parsers, XML parsers
(including details about DOM, SAX, and pull-style parsers and when each
is appropriate), and WADL libraries are discussed, with best-of-breed
options presented for each language.

In chapter 3, the authors use Amazon’s S3 service design to point out
features of the REST architecture which make it different from RPC-style
APIs. The complexity of the examples increases to match the requirements
of the service, including advanced authentication techniques.

Resource Oriented Architecture, introduced in Chapter 4 and discussed
in an extended design example used through chapters 5 and 6, is perhaps
the most interesting part of the book. ROA is a set of design principles
which encourage you to think about your service in a specific way to
enable REST APIs. The principles are:

Descriptive URIs
    URIs should convey meaning.
Addressability
    Expose all information via URLs.
Statelessness
    The client maintains the application state so the server does not
    have to.
Representations
    Resources can have multiple representations, based on format, level
    of detail, language, and other criteria.
Connectedness
    Link between related resources explicitly within the
    representations, so the client does not have to know how to build
    URLs.
Uniform Interface
    Use the HTTP methods (GET, PUT, POST, DELETE) as designed.
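As a rough illustration of how several of these principles can show up in a single response, here is a small Python sketch. The example.com URIs, the field names, and the represent_bookmark function are all hypothetical; none of them come from the book.

```python
import json

BASE = "https://example.com"  # hypothetical service root


def represent_bookmark(user, bookmark_id, title, fmt="json"):
    """Build one representation of a (hypothetical) bookmark resource."""
    # Descriptive, addressable URI: the path says what the resource is.
    uri = f"{BASE}/users/{user}/bookmarks/{bookmark_id}"
    doc = {
        "uri": uri,
        "title": title,
        # Connectedness: the server hands out links to related resources,
        # so the client never has to assemble URLs itself.
        "links": {
            "owner": f"{BASE}/users/{user}",
            "all_bookmarks": f"{BASE}/users/{user}/bookmarks",
        },
    }
    # Representations: the same resource rendered in more than one format.
    if fmt == "json":
        return json.dumps(doc, sort_keys=True)
    if fmt == "text":
        return f"{title} <{uri}>"
    raise ValueError(f"unsupported format: {fmt}")
```

The function takes everything it needs as arguments and keeps no per-client state, which is the server-side half of statelessness.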

To illustrate these principles, in chapters 5 and 6 the authors build
a web mapping service, similar to Google Maps. This detailed example
also serves as a way to introduce their ROA development process.

  1. Identify the data set to be managed by the service.
  2. Split up that data into resources.
  3. Name each resource with a URI.
  4. Expose a subset of the uniform interface for each resource,
    depending on what makes sense and what features are to be supported.
  5. Design representations to be passed from client to server.
  6. Design representations to be passed from server to client.
  7. Include links to other resources.
  8. Consider a typical course of events, to ensure completeness.
  9. Consider error conditions, to identify the HTTP response codes to be
    used.

Chapter 7 includes the implementation of a bookmarking service similar
to del.icio.us. The sample code uses Ruby extensively, and it was a
little more advanced than what I was prepared to absorb without a Ruby
primer. One important point made in the prose of the chapter is that
code frameworks may constrain your design by making certain choices for
implementation easier or harder.

Chapter 8 is a summary of the REST and ROA principles discussed in the
earlier chapters, and is an excellent reference once you’ve finished
reading the whole book. It is also suitable as a “Cliff’s Notes” version
of the material, if you don’t have time to read everything. If you want
to review the book before reading it, go to the book store and take a
look at this chapter.

While chapter 2 covered client implementation techniques, chapter 9 is
a survey of tools and aspects of web service implementations in
different languages. It covers topics such as XHTML, HTML5,
micro-formats, Atom, JSON, RDF, control flow patterns, and WSDL.

In chapter 10, the authors give an extensive comparison of ROA and
“Big Web Services” to argue that ROA is simpler, requires fewer tools,
and can even be more interoperable.

Chapter 11 is the requisite “How to use this with AJAX” chapter.
And the book wraps up in chapter 12 with a discussion of frameworks
for doing RESTful development in multiple languages. The coverage of
Django includes a dispatcher that decides how to handle the request
based on the HTTP method invoked.
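The idea of dispatching on the HTTP method can be sketched without any framework at all. The following is not the book’s Django code; the Resource and Bookmark classes and the in-memory store are hypothetical names I made up to show the shape of the technique.

```python
class Resource:
    """Route a request to the method named after the HTTP verb."""

    allowed = ("GET", "PUT", "POST", "DELETE")

    def dispatch(self, method, *args, **kwargs):
        if method.upper() not in self.allowed:
            return 405, "Method Not Allowed"
        handler = getattr(self, method.lower(), None)
        if handler is None:
            # The verb is valid HTTP, but this resource does not support it.
            return 405, "Method Not Allowed"
        return handler(*args, **kwargs)


class Bookmark(Resource):
    """Hypothetical resource backed by an in-memory dictionary."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        if key not in self.store:
            return 404, "Not Found"
        return 200, self.store[key]

    def put(self, key, value):
        created = key not in self.store
        self.store[key] = value
        return (201 if created else 200), value

    def delete(self, key):
        self.store.pop(key, None)
        return 204, ""
```

Each resource exposes only the subset of the uniform interface that makes sense for it. Bookmark defines no post method here, so a POST request gets a 405 response.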

Conclusion

Before reading “RESTful Web Services”, I had a somewhat cloudy notion
of REST and how to apply it. The book cleared up both points.
It also offered an invaluable concrete process to follow when
implementing a web service using REST and ROA principles. I expect my
copy to see a lot of use and become dog-eared as I refer back to it
frequently.

PyATL Book Club

The Atlanta Python Users’ Group runs an online book club. We encourage
all Atlanta area Python developers to check the schedule on
PyATL.org and come down to a meeting. Anyone is free to join and
participate. For more reviews by members of the book club, check out
the forums or our Reviews List.