<?xml version="1.0" encoding="iso-8859-1" ?>

<rss version="2.0">
<channel>
<title>Eric.Weblog()</title>
<link>http://software.ericsink.com/</link>
<description>Thoughts about software from yet another person who invented the Internet</description>
<copyright>Copyright 2001-2010 Eric Sink. All Rights Reserved</copyright>
<generator>mine</generator>


<item>
<title>Obstacles to an enterprise DVCS</title>
<guid>http://software.ericsink.com/articles/vcs_trends.html</guid>
<link>http://software.ericsink.com/articles/vcs_trends.html</link>
<pubDate>Fri, 29 Jan 2010 07:00:00 CST</pubDate>
<description>
<![CDATA[
<p>On 26 January 2010 I gave a presentation to <a
href="http://softwaregr.org/">Software GR</a>.&nbsp; The talk was an overview of
several trends that we have seen in the version control tools market over the
last 40 years.&nbsp; I often like to follow a talk like this by publishing the same
content here on my blog in the form of a complete article.&nbsp; This time I think
I'll just eliminate a lot of the <a
href="http://www.urbandictionary.com/define.php?term=tl%3Bdr">tldr</a> problem
and summarize the highlights:</p>

<ul style='margin-top:0in' type=disc>
 <li  style='margin-bottom:6.0pt'>The two big trends in version
     control today are Integration and Decentralization.</li>
 <li  style='margin-bottom:6.0pt'>Integration is driven by ALM.&nbsp;
     It is the desire to have all tools used by a development team fully
     integrated together.</li>
 <li  style='margin-bottom:6.0pt'>Decentralization is driven by
     the recent wave of DVCS tools like Git and Mercurial.&nbsp; They offer
     compelling benefits such as performance, a different kind of scalability,
     and more flexible workflows.</li>
 <li  style='margin-bottom:6.0pt'>The ALM trend is happening in
     the enterprise market.&nbsp; Enterprises want everything integrated with
     everything else, and they want everything to support their ability to
     enforce process.</li>
 <li  style='margin-bottom:6.0pt'>The DVCS trend is happening in
     the open source community.&nbsp; Born of the legendary cat fight between
     BitKeeper and the Linux kernel developers, Git and Mercurial are maturing
     and gathering momentum at a remarkable rate.</li>
 <li  style='margin-bottom:6.0pt'>These two trends are going to
     clash in a big way.&nbsp; SourceGear's graphic designer drew me a nice diagram
     to depict this.<br>
     <img border=0 width=576 height=439 src="http://software.ericsink.com/articles/1775_image001.jpg"></li>
 <li  style='margin-bottom:6.0pt'>The two trends cannot stay
     separate.&nbsp; Each one has advantages which are too important for the other community
     to ignore.</li>
 <li  style='margin-bottom:6.0pt'>But the two trends and their
     respective communities are a bit like oil and water.</li>
 <li  style='margin-bottom:6.0pt'>Enterprises want tools that
     constrain.&nbsp; The open source community wants tools that empower.</li>
 <li  style='margin-bottom:6.0pt'>The benefits of a DVCS would
     be diluted by integrating it with a bunch of other tools that are highly
     centralized.</li>
 <li  style='margin-bottom:6.0pt'>Enterprises need at least a
     little centralization for things like user administration.&nbsp; In their eyes,
     complete decentralization without accountability and auditing features is
     a bug.</li>
 <li  style='margin-bottom:6.0pt'>Even as enterprise attitudes
     about open source are changing, that change is happening slowly, and the
     GPL (used by both Git and Mercurial) is still considered the scariest
     license.</li>
 <li  style='margin-bottom:6.0pt'>So Git and Mercurial are not even
     close to being enterprise-ready.&nbsp; Similarly, none of the leading
     enterprise ALM tools are even close to being a DVCS.</li>
 <li  style='margin-bottom:6.0pt'>I believe that the main enterprise
     ALM providers (IBM/Rational, Microsoft, Serena and Borland) will all
     attempt to add DVCS features to their products.&nbsp; At least two of these
     companies (IBM/Rational, in a talk by Jean-Michel Lemieux at the Rational
     Conference in 2009, and <a
     href="http://blogs.msdn.com/bharry/archive/2010/01/27/codeplex-now-supports-mercurial.aspx">Microsoft</a>)
     have already made public remarks about a desire to move in that direction.</li>
 <li  style='margin-bottom:6.0pt'>And I predict that they will
     all fail.&nbsp; It is impossible to turn any of these systems into a true DVCS
     without a nearly complete rewrite.&nbsp; The D in DVCS is not a feature which
     can be added.</li>
 <li  style='margin-bottom:6.0pt'>But all of them will do it
     anyway, by making compromises.&nbsp; They will try to add "just enough"
     Decentralization.&nbsp; Some of their customers will find the results to be
     sufficient.</li>
 <li  style='margin-bottom:6.0pt'>Meanwhile, the true DVCS tools
     will continue to move forward, but their progress toward credible ALM will
     be slow.&nbsp; Enterprise-level integration is grunge work, not the kind of coding
     that hackers do as a labor of love.&nbsp; Nobody does this stuff without
     getting paid.</li>
 <li  style='margin-bottom:6.0pt'>So these two trends will
     continue to be distinct for a while, but the pressure and tension between
     them will remain, and the areas of overlap are going to continue getting
     messier.</li>
</ul>
]]>
</description>
</item>

<item>
<title>Reflecting on our "SourceSafe Must Die" Campaign</title>
<guid>http://software.ericsink.com/entries/why_so_serious.html</guid>
<link>http://software.ericsink.com/entries/why_so_serious.html</link>
<pubDate>Fri, 15 Jan 2010 08:10:20 CST</pubDate>
<description>
<![CDATA[
<p align=center style='margin-top:0in;margin-right:1.0in;
margin-bottom:0in;margin-left:1.0in;margin-bottom:.0001pt;text-align:center'><i>&quot;Do
I really look like a guy with a plan? <br>
You know what I am? I'm a dog chasing cars. <br>
I wouldn't know what to do with one if I caught it. <br>
You know, I just ... do ... things.&quot;<br>
</i>-- <a href="http://www.imdb.com/title/tt0468569/">The Joker</a></p>

<p align=center style='text-align:center'><img border=0
width=320 height=298 src="http://software.ericsink.com/entries/1774_image001.jpg"></p>

<p>On the product side of marketing, planning has served me
well.</p>

<p>But on the marcomm side, you know, I just ... do ... things.</p>

<p>And since the whole point of marcomm is to draw attention, I
try to do things which are at least a little outrageous:</p>

<table class=MsoTableGrid border=0 cellspacing=0 cellpadding=0
 style='border-collapse:collapse'>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>Years ago we wanted to tell people about SourceOffSite as
  a telecommuting solution, so we gave away boxer shorts at trade shows and ran
  ads advising people to &quot;work in your skivvies&quot;.</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=243
  src="http://software.ericsink.com/entries/1774_image002.jpg"></p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>When Microsoft did their Software Legends campaign, we
  spoofed it with <a href="http://www.notalegend.com/">Not A Legend</a>.</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=238
  src="http://software.ericsink.com/entries/1774_image003.jpg"></p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>We started talking about Vault several months before its
  release.&nbsp; Since it was vaporware, we showed up at Tech-Ed with a <a
  href="http://www.ericsink.com/20020416.html">fog machine</a> in our booth.</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=216
  src="http://software.ericsink.com/entries/1774_image004.jpg"></p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>When Vault was released, we promoted the product as a
  &quot;compelling replacement for Visual SourceSafe&quot; with a movie-themed
  campaign.&nbsp; We hired <a href="http://vaultthemovie.com/">Hal Douglas</a> to
  voice our <a href="http://vaultthemovie.com/">trailer</a>.&nbsp; And yes, he
  started with "<a href="http://www.youtube.com/watch?v=fVDzuT0fXro">In a world</a>...".</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=216
  src="http://software.ericsink.com/entries/1774_image005.jpg"></p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>We created a cartoon character called <a
  href="http://www.sourcegear.com/TEM/">The Evil Mastermind</a>, with twelve
  full-page print ads forming a complete story arc, plus two full-length comic
  books distributed at trade shows.</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=216
  src="http://software.ericsink.com/entries/1774_image006.jpg"></p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>&nbsp;</p>
  </td>
 </tr>
 <tr>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p>When we were doing Guitar Hero in our trade show booth, we
  gave away actual <a
  href="http://blog.roub.net/2008/02/the_evil_mastermind_guitars_you_heard_me.html">custom
  guitars</a> with The Evil Mastermind graphics.</p>
  </td>
  <td valign=top style='padding:0in 5.4pt 0in 5.4pt'>
  <p><img border=0 width=288 height=98
  src="http://software.ericsink.com/entries/1774_image007.jpg"></p>
  </td>
 </tr>
</table>

<p>&nbsp;</p>

<p>We did those things because nobody had done them before.&nbsp; I'll
try just about anything.&nbsp; I just want to see what works.&nbsp; And afterward, I
usually report here on my blog about how these things go.</p>

<p>In the second half of 2009, we did a marketing campaign for
Vault.</p>

<h2>This is your brain on SourceSafe</h2>

<p>As I mentioned above, Vault was originally positioned to be
a compelling replacement for SourceSafe.&nbsp; Since our 5.0 release contains a new <a
href="http://sourcegear.com/fortress/video/handoff.html">Handoff</a> feature
which makes it easier than ever to make the transition, we decided to focus
this campaign on the product's original positioning, as a SourceSafe
replacement.</p>

<p>The creative on this campaign was ... edgy.&nbsp; In a nutshell, we
compared SourceSafe to an addiction.&nbsp; The ads were loosely patterned after billboards
and ads aimed at convincing people not to use illegal drugs.&nbsp; The basic idea
was to portray SourceSafe as something which might send your life into a
downward spiral toward a 12-Step program or rehab.</p>

<p>We knew from the beginning that some people were going to be
ticked off.&nbsp; We just weren't sure how many people and who.</p>

<p>We tried testing the ads by taking them home to our spouses
for feedback.&nbsp; This resulted in a few tweaks, but we didn't get any criticism
that caused us to change course.&nbsp; One guy's wife suggested that the ads would
be a better reflection of our industry if the models in the stock photos were
less attractive.&nbsp; <b>:-)</b></p>

<p>So we moved ahead.&nbsp; The first thing we did was run the ads
in MSDN magazine, which is published by an outside firm in close coordination
with Microsoft.&nbsp; Nobody complained, so we kept going.&nbsp; We ran banner ads on
several Microsoft-centric websites.</p>

<p><img width=288 height=482 src="http://software.ericsink.com/entries/1774_image008.jpg"
align=right hspace=12>And then we showed up in November at the Microsoft
Professional Developers Conference with this campaign as the theme for the
entire booth.&nbsp; We gave away hundreds of T-shirts saying "VSS Must Die".</p>

<p>You can see the whole thing at <a
href="http://vssisdead.com/">vssisdead.com</a></p>

<p>This campaign was darker and more negative than anything
we've done before.&nbsp; We knew we were pushing the envelope.</p>

<h2>Did we go too far?</h2>

<p>Well, we certainly didn't expect to win any Most Admired
Company awards by running a marketing campaign which portrays the users of our
competitor's product as drug addicts.&nbsp; :-)&nbsp; But the campaign was intended to be
funny, in a &quot;humor noir&quot; sort of way.</p>

<p>Somewhere during the execution of this campaign, I realized
that SourceSafe is very much like that dorky kid in high school who gets
teased by everybody.</p>

<p>Why do high schoolers pick on other kids?&nbsp; Because it
works.&nbsp; In high school, popularity is correlated with several factors, but one
of them is unkindness.&nbsp; The meaner you are (to the right people), the more
popular you are.&nbsp; Once the crowd has observed who is getting bullied by the
popular kids, others join in.&nbsp; Even if they don't know the kid, they start ripping
on them, just to try and identify with the &quot;in&quot; crowd.</p>

<p>As grownups, most of us know that this is reprehensible.&nbsp; No
kid deserves to be treated this way just because they're different.</p>

<p>But teenagers do it anyway.&nbsp; And they do it because it gets
them what they want.</p>

<p>This particular kid is an easy target.&nbsp; SourceSafe isn't
just a little bit dorky.&nbsp; We're talking pocket protector, greasy hair, and a
sport coat with elbow patches.</p>

<p>When it comes to poking fun, if any company is fair game,
it's Microsoft.&nbsp; And if any Microsoft product is safe to pick on, it's gotta be
SourceSafe.&nbsp; SourceSafe is the bullying target that <a
href="http://www.codinghorror.com/blog/archives/000660.html">everyone</a> can <a
href="http://www.highprogrammer.com/alan/windev/sourcesafe.html">agree</a> on.</p>

<p align=center style='text-align:center'><img border=0
width=533 height=333 src="http://software.ericsink.com/entries/1774_image009.jpg"></p>

<p>Dissing SourceSafe is so common that folks do it whether
they have used SourceSafe or not.&nbsp; Just like in high school, people join the
bashing just because they think it makes them look cool.</p>

<p>And SourceSafe has basically nobody defending it.&nbsp; When
emacs people get an attitude, the vi fans speak up.&nbsp; When Visual Studio fans
start trash talking, the Eclipse crowd starts showing features.&nbsp; But nobody
stands up for SourceSafe.&nbsp; People bash it, and SourceSafe just mopes down the
hall wearing flood pants and a shirt with the top button done.</p>

<p>So anyway, we chose to go negative on this, but we figured we
were in plenty of company.</p>

<p>And while I'm rationalizing and making lame excuses for
being a jerk, please note that SourceSafe is NOT REALLY A PERSON.</p>

<h2>Results</h2>

<p>In general, this campaign worked.&nbsp; People at PDC loved the
T-shirts.&nbsp; The click-through rate on the banner ads was the highest we have
ever seen.</p>

<p>Some people were offended, but we received far more positive
feedback than negative.</p>

<p>However, just like the awkward kid at school, SourceSafe
doesn't really deserve this.&nbsp; I've admitted it before, and I'll do it again now:&nbsp;
SourceSafe isn't really that bad.</p>

<p>In its day, SourceSafe was awesome.&nbsp; When it was created by
One Tree in the early 90s, it was nearly revolutionary.&nbsp; SourceSafe brought
ease of use in version control to a whole new level.&nbsp; Microsoft acquired this
product because it was outstanding.</p>

<p>Over the years, SourceSafe hasn't always aged well.&nbsp; Its
architecture didn't fit with the Internet.&nbsp; By relying on file sharing
protocols for network access, SourceSafe ended up with data corruption problems
that triggered a tidal wave of criticism.</p>

<p>But overall, SourceSafe has been a very successful piece of
software.&nbsp; Most of us would love to create something that has thousands of
happy users 15 years later.</p>

<p>At trade shows, people come up to us and ask why they should
switch from SourceSafe to Vault.&nbsp; We always respond by asking them if they are
happy with SourceSafe.&nbsp; If they say yes, we tell them not to switch.&nbsp; This conversation
has happened at every show I have ever attended.&nbsp; Lots of people use SourceSafe
every day without problems.</p>

<p>And despite those negative ads, I believe SourceGear is
doing more for SourceSafe users than any other company, including Microsoft.&nbsp;
We recently shipped <a href="http://www.sourcegear.com/sos/">SourceOffSite 5.0</a>,
a major upgrade with improved performance, new features, and a fancy new ribbon
UI.</p>

<h2>Eric, if this is an apology, it sucks.</h2>

<p>It's not.&nbsp; I don't owe SourceSafe an apology.&nbsp; SourceSafe
isn't a person.&nbsp; It's just a bunch of code.</p>

<p>And it's hard to imagine the need to apologize to Microsoft
as a company when so many of its employees stopped by the booth at PDC to join
the bashing.&nbsp; Some of them took shirts.</p>

<p>So I'm not really apologizing.&nbsp; I'm just sharing my
experience and my reactions to it.</p>

<p>One exception:&nbsp; SourceSafe's principal author was Brian
Harry.&nbsp; In working through the Microsoft acquisition of Teamprise I came to
know Brian and developed a great deal of admiration for him.&nbsp; He is incredibly
smart, and his accomplishments are amazing, including SourceSafe, the CLR, and
Team Foundation Server.&nbsp; Brian, if our campaign caused you any personal
offense, please accept my public apology.</p>

<p>So anyway, there you have it.&nbsp; I went negative.&nbsp; And it
worked.</p>

<p>I'm not eager to do it again.</p>

<p>But I have no regrets.&nbsp; You know, I just ... do ... things.</p>

<h2>Credit and blame</h2>

<p>I wrote most of this piece in the first person, but the
truth is I deserve more blame than credit.&nbsp; If you were offended or
disappointed by this marketing campaign, blame me.&nbsp; On the other hand, if you
liked this marketing campaign, credit John Woolley and Paul Roub.&nbsp; The creative
work here was mostly theirs.</p>

<p>&nbsp;</p>
]]>
</description>
</item>

<item>
<title>Comments disabled</title>
<guid>http://software.ericsink.com/entries/haloscan_gone.html</guid>
<link>http://software.ericsink.com/entries/haloscan_gone.html</link>
<pubDate>Wed, 30 Dec 2009 09:34:55 CST</pubDate>
<description>
<![CDATA[
<h3>Short Version</h3>

<p>Sorry folks, until further notice, my blog does not support
comments.</p>

<h3>Long Version</h3>

<p>I've been using Haloscan for comments on this blog.</p>

<p>Haloscan is being turned off by the company that acquired
it.</p>

<p>That company offered a transition to a new service, but that
transition requires more effort than I am willing to invest (zero).</p>

<p>I downloaded all the old comments in some sort of XML file,
but doing anything with that file would require effort.</p>

<p>Investigating other ways of providing comments for this blog
would also require effort.</p>

<p>A day may come when the laziness of this blogger fails, when
I forsake my procrastination and break all bonds of inertia, but it is not this
day.</p>
]]>
</description>
</item>

<item>
<title>My excuses for not blogging about the Microsoft/Teamprise deal</title>
<guid>http://software.ericsink.com/entries/microsoft_teamprise.html</guid>
<link>http://software.ericsink.com/entries/microsoft_teamprise.html</link>
<pubDate>Fri, 13 Nov 2009 12:51:17 CST</pubDate>
<description>
<![CDATA[
<p>People keep asking me why I haven't blogged about the
Microsoft acquisition of our Teamprise division.</p>

<p>Well, it's kind of complicated.</p>

<p>It all started three days before the signing of the deal
when my laptop died.&nbsp; And I mean it's really dead.&nbsp; It won't boot, from any
device.</p>

<p>Great timing, eh?</p>

<p>Fortunately, all I really needed for working on the deal was
email and Microsoft Word, so I just switched over to my netbook.</p>

<p>I completely forgot about the MacBrick Pro until this
weekend when I realized that the press coverage was going to hit Monday morning
and the only installation of my blogging software was trapped in a lifeless
piece of aluminum on my office floor.</p>

<p>So I ran out and bought a new Mac laptop, hoping to get
everything going in time to write my blog entry for Monday.</p>

<p>And then I figured, heck, as long as I was doing a
completely new setup, why not start off right with an Intel X25-M instead of
the stock hard disk?</p>

<p>Getting everything configured wasn't too difficult, but the SSD
ended up costing me a lot of time because Monday morning I had to tell the
other coders on my project team that I can do a full build in 24 seconds.&nbsp; All
that gloating killed a couple of hours, and by the time I got back to my desk I
figured I should check and see how the press coverage was going.</p>

<p>Whoa.&nbsp; The Microsoft PR machine is amazing!&nbsp; They got over
230 articles published about the acquisition.&nbsp; I couldn't get that kind of
press coverage without committing a felony.</p>

<p>Right about then I got into an argument with my daughter
because I wanted her to walk four blocks from her school over to my office and
she said it was too far.&nbsp; I wish my Mom would call me more often to tell me how
much she appreciates the fact that I was a model teenager who never caused my
parents any trouble.</p>

<p>So anyway, with hundreds of people already writing about the
deal, I needed a new angle.&nbsp; I figured I had to come up with something cool or
not post anything at all.&nbsp; So I started drafting something, but I got stuck
when I couldn't find anybody to confirm whether Kanye West jokes are still
funny or not.&nbsp; (Yo Eric!&nbsp; I'm really happy for you and I'mma let you finish,
but Groove was the greatest Microsoft acquisition of all time!)</p>

<p>A short time later our sales VP walked in to let me know
that SourceGear's name was mentioned in the "New York Freaking Times".&nbsp; Cool.</p>

<p>The next morning I resumed working on this blog entry, or
rather, on the infrastructure to support same.&nbsp; I restored the VMware image
from my Time Machine disk, but I couldn't get the product serial number to
work.&nbsp; So I figured maybe it was one of those stupid Snow Leopard bugs that
everybody is complaining about, and decided to upgrade to 10.6.2.&nbsp; But that
took hours, because apparently every Steve Jobs disciple on the planet was
upgrading their Mac on the same day, so Apple's download servers were really slow.</p>

<p>While I was doing that, the aforementioned daughter asked me
to drive her to the mall and I refused.&nbsp; So she walked FIVE MILES to get there
by herself.</p>

<p>Keep that in mind next time you're having trouble
understanding the mind of a teenager:&nbsp; FIVE MILES to the mall is a shorter walk
than FOUR BLOCKS to your Dad's office.</p>

<p>Suddenly I realized it had been a whole day since I told any
of my coworkers that I can build the whole tree in 24 SECONDS, and well, you
know what happened to the rest of my morning.</p>

<p>So then I walked across the street to the coffee shop to
pick up a copy of the local newspaper.&nbsp; As usual, they did a very nice job on
the press coverage for us.&nbsp; And, as usual, our story was below the fold because
the main story of the day was about farming.</p>

<p>Keep that in mind next time you're having trouble
understanding the mind of Champaign:&nbsp; If you want your big-time corporate
acquisition to be the top story, make sure you work something about corn yields
into the deal.</p>

<p>For those of you keeping score at home, that's 232 points
for the Microsoft PR team and one point for me.&nbsp; I'm sure there's some PR guy
at Microsoft trying to take credit for Don Dodson's piece in the Tuesday
morning edition of the Champaign-Urbana News-Gazette, but that one was MINE.&nbsp; They
may be able to place stories in the New York Freaking Times, but I've got
connections too.</p>

<p>I'm not kidding -- building this project's code on some
machines can take several minutes, but my new Mac can do a whole build in 24
seconds.&nbsp; The X25-M is way cool.&nbsp; I am now seriously considering putting a $700
SSD into my $300 netbook.</p>

<p>This morning I gave up and paid VMware for a new serial
number, and here I am writing in my blog once again.</p>

<p>As I write this, the realization hits me.&nbsp; I got frustrated
because I couldn't move my VMware installation to my new machine.&nbsp; My company just
had a liquidity event.&nbsp; I could have paid VMware $79 to solve the problem, but
instead, I decided it would be better to thrash on that problem for three days
and THEN pay the $79.&nbsp; Yep, I'm in the big leagues now.</p>

<p>So anyway, if you haven't heard, Microsoft announced Monday
morning that it has acquired our Teamprise division.&nbsp; I think the deal ended up
being a nice win for both Microsoft and SourceGear.</p>

<p>I'll be at PDC next week.&nbsp; Stop by the SourceGear booth and
say hi.</p>

]]>
</description>
</item>

<item>
<title>Vault 5.0 has shipped</title>
<guid>http://software.ericsink.com/entries/vault5.html</guid>
<link>http://software.ericsink.com/entries/vault5.html</link>
<pubDate>Thu, 30 Jul 2009 09:52:31 CST</pubDate>
<description>
<![CDATA[
<p>Hooray!&nbsp; <a
href="http://www.sourcegear.com/vault/downloads.html">Vault 5.0</a> has <a
href="http://vaultblog.sourcegear.com/articles/2009/07/29/vault-5-0-and-fortress-2-0-released">shipped</a>!</p>

<p>The <a
href="http://www.sourcegear.com/vault/releases/5.0.html">release notes</a>
contain an overview of what's new.</p>

]]>
</description>
</item>

<item>
<title>Vault 5.0 Beta 2</title>
<guid>http://software.ericsink.com/entries/vault5_beta2.html</guid>
<link>http://software.ericsink.com/entries/vault5_beta2.html</link>
<pubDate>Mon, 06 Jul 2009 12:05:54 CST</pubDate>
<description>
<![CDATA[
<p>Last week's beta 2 release means that the long-awaited version
5.0 of SourceGear Vault is coming soon.&nbsp; This includes the regular edition of
Vault as well as the "much more better" edition which has integrated
bug-tracking.&nbsp; (The latter product is actually called SourceGear Fortress and
carries the version number 2.0, but its heart is still Vault.)</p>

<p>This release has numerous improvements, but for now I want
to highlight one new feature which we call "VSS Handoff".&nbsp; Basically, Handoff
is a simpler and faster way of importing a SourceSafe database.&nbsp; Instead of
converting all your old history, Vault simply wraps your VSS database and makes
it part of your Vault repository.&nbsp; After that, all new checkins will go into
the regular Vault database.&nbsp; For history operations which need to access stuff
that happened before the Handoff, the VSS database is seamlessly referenced.&nbsp;
The transition from SourceSafe can't get more painless than this.</p>

<p>Bottom line:&nbsp; If you are still using SourceSafe, Vault 5
will remove your last excuse.</p>

<p>In fact, shortly after Vault 5 is released, I plan to go on
a world tour.&nbsp; If you are still clinging to SourceSafe, I will visit your
office.&nbsp; I will taunt you mercilessly and suggest an MRI to confirm that there
is nothing between your ears but bone.&nbsp; And I will drench you with my new <a
href="http://www.amazon.com/Super-Soaker-Infusion-Flash-Blaster/dp/B000BXJ0JS">Super
Soaker Max Infusion Flash Flood Water Blaster</a>.&nbsp; </p>

<p>And I will be morally justified.&nbsp; You've been given many
opportunities to switch to any one of several dozen competent version control
tools.&nbsp; And yet, it's 2009 and you're still using SourceSafe.&nbsp; Surely you
didn't expect this to end well?</p>

<p>BTW, for more details about Vault 5, check out the recent blog
entries by <a
href="http://vaultblog.sourcegear.com/articles/2009/07/01/beta-2-is-out">Jeremy</a>
or <a href="http://blog.roub.net/2009/07/now_available_vault_5_beta_2_f.html">Paul</a>.&nbsp;
<b><span style='font-family:"Courier New"'>:-)</span></b></p>

]]>
</description>
</item>

<item>
<title>IBM Rational Software Conference</title>
<guid>http://software.ericsink.com/entries/rsdc_2009.html</guid>
<link>http://software.ericsink.com/entries/rsdc_2009.html</link>
<pubDate>Thu, 28 May 2009 13:12:03 CST</pubDate>
<description>
<![CDATA[
<p>Anybody attending the <a
href="http://www-01.ibm.com/software/rational/rsdc/">Rational Software
Conference</a> in Orlando next week?</p>

<p>I've been making very last-minute plans to be there for some
meetings, but I'll have some free time, and it's always cool to connect a face
with an email address.&nbsp; So if you're a reader of my blog and will be at the
Rational conference next week, drop me an <a
href="http://software.ericsink.com/about_author.html">email</a>.</p>

<p>And yes, yes I know this blog entry should really have been
a tweet.&nbsp; I just haven't gotten into the Twitter thing at all yet, but this
very moment is the first time I've thought maybe I should.&nbsp; :-)</p>
]]>
</description>
</item>

<item>
<title>Time and Space Tradeoffs in Version Control Storage</title>
<guid>http://software.ericsink.com/entries/time_space_tradeoffs.html</guid>
<link>http://software.ericsink.com/entries/time_space_tradeoffs.html</link>
<pubDate>Tue, 28 Apr 2009 08:00:00 CST</pubDate>
<description>
<![CDATA[
<p>Storage is one of the most difficult challenges for a
version control system.&nbsp; For every file, we must store every version that has
ever existed.&nbsp; The logical size of a version control repository never shrinks.&nbsp;
It just keeps growing and growing, and every old version needs to remain
available.</p>

<p>So, what is the best way to store every version of
everything?</p>

<p>As we look for the right scheme, let's remember three things
we consider to be important:</p>

<ul style='margin-top:0in' type=disc>
 <li >Data integrity is paramount.&nbsp; In a version control tool,
     nothing can be considered to be more important than guarding the safety of
     the data.<br>
     <br>
 </li>
 <li >Performance is critical.&nbsp; Software developers have about
     as much patience as a German Shepherd sitting in front of a pot roast.<br>
     <br>
 </li>
 <li >Space matters too.&nbsp; We're going to be storing lots of
     data, much of which is being kept almost entirely for the purpose of
     archiving history.&nbsp; We'd prefer to keep this archive as compact as
     possible.</li>
</ul>

<p>In this blog entry I will report the results of some
exploration I've been doing.&nbsp; I am experimenting with different ways of storing
the full history of one source code file.&nbsp; In this case, the file comes from
the source code for SourceGear Vault.&nbsp; It has been regularly edited for almost
seven years.&nbsp; There are 508 versions of this file.</p>

<p>As I describe the various things I have tried, a running
theme will be the <a href="http://en.wikipedia.org/wiki/Time-space_tradeoff">classic
tradeoff</a> of space vs. speed.&nbsp; In physics, we know that matter and energy
are interchangeable.&nbsp; In computer science, we know that time and space are
interchangeable.&nbsp; Usually, we can find a way to make things faster by using
more space, or make things smaller by taking more time.</p>

<p>As I said, I'll be storing 508 versions of the same file.&nbsp;
It's a C# source code file.&nbsp; For each attempt, I will report two things:</p>

<ul style='margin-top:0in' type=disc>
 <li >The total amount of space required to store all 508
     versions.<br>
     <br>
 </li>
 <li >The total amount of time required to retrieve (or
     decompress or decode) all 508 versions, one at a time.</li>
</ul>
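<p>The two measurements can be sketched as a tiny harness.&nbsp; This is only an
illustration in Python, not the code used for the numbers in this entry.&nbsp; The
<code>measure</code> function, the in-memory list standing in for files on disk,
and the sample data are all assumptions made up for the example.</p>

<pre>
```python
import time


def measure(stored, fetch, count):
    """Return (bytes stored, seconds to fetch every version once).

    stored: the blobs a scheme keeps on disk (hypothetical in-memory stand-in)
    fetch:  function taking a 0-based index and returning that version's bytes
    """
    size = sum(len(blob) for blob in stored)
    start = time.perf_counter()
    for index in range(count):
        fetch(index)
    return size, time.perf_counter() - start


# The "full and uncompressed" scheme: every version is stored as-is.
versions = [b"// version\n" * n for n in range(1, 6)]  # made-up stand-in data
stored = list(versions)
size, seconds = measure(stored, lambda i: stored[i], len(versions))
print(size, "bytes,", round(seconds, 6), "sec")
```
</pre>

<p>Every scheme discussed below can be plugged into the same two slots: what
gets stored, and how one version is fetched.</p>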

<p>Before we get started, a few caveats:</p>

<ul style='margin-top:0in' type=disc>
 <li >I realize that these experiments would yield different
     results for a different kind of file.&nbsp; If you're storing source code,
     there might be some things here you can apply.&nbsp; If you're storing JPEG
     images, not so much.<br>
     <br>
 </li>
 <li >All these experiments were done on my MacBook Pro laptop.&nbsp;
     The CPU is a Core 2 Duo, which I consider to be decently fast.&nbsp; But like
     most laptops, this machine has an I/O system which I consider to be
     quasi-crappy.&nbsp; I would probably get somewhat different results if I were
     running on a more serious piece of hardware.</li>
</ul>

<p>OK, how should we store these 508 versions of the file?</p>

<h3>No compression at all</h3>

<p>As a first attempt, let's just store them.&nbsp; No compression
or funky encoding.&nbsp; Each of the 508 versions will be stored in full and
uncompressed form.</p>

<p>This is the starting point, even if it is not very
practical.</p>

<p style='margin-left:.5in'>Size:&nbsp; 112,643 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; 2.5 sec</p>

<p><b><span style='font-size:10.0pt;font-family:"Courier New"'>#ifdef
DIGRESSION</span></b></p>

<p>Yes, dear reader, I admit that this file is far too long.</p>

<p>You can do the math.&nbsp; If the archive takes 112 MB and there
are 508 versions, then each one is about 220 KB.&nbsp; That's pretty big for a source code
file.</p>

<p>Actually, it's worse than you think.&nbsp; The 220 KB figure is
just the average.&nbsp; The first version of the file is around 90 KB.&nbsp; The latest
version is over 400 KB.&nbsp; </p>

<p>In our defense, I'd like to point out that this piece of code
needs to stay compatible with .NET 1.1, so the entire class must be in a single
file.&nbsp; However, I'd still have to answer to the charge of "First Degree Failure
to Refactor".&nbsp; Fine.&nbsp; I'll have my attorney contact you to plead out on a
lesser charge.&nbsp; I'm thinking maybe "Third Degree Contributing to the
Delinquency of an Intern", or something like that.</p>

<p><b><span style='font-size:10.0pt;font-family:"Courier New"'>#endif</span></b></p>

<p>This "full and uncompressed" format uses an awful lot of
space, but it is also the fastest.&nbsp; We will find ways of making this smaller, but
all of those ways will be slower.</p>

<p>The relevant questions are:</p>

<ul style='margin-top:0in' type=disc>
 <li >How much smaller?</li>
 <li >How much slower?&nbsp; </li>
</ul>

<p>Some solutions will allow us to make this a lot smaller and
only a little slower.&nbsp; Those are interesting.&nbsp; Other possibilities will be only
a little smaller but a lot slower.&nbsp; Those are not so interesting.</p>

<h3>Simple compression</h3>

<p>OK, for our next idea, let's just compress every version
with zlib.</p>

<p style='margin-left:.5in'>Size:&nbsp; 22,516 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; 4.0 sec</p>

<p>The results of this idea are surprisingly impressive.&nbsp; The
archive is over 80% smaller, and only about 60% slower.&nbsp; That's darn good,
considering that I didn't have to be terribly clever.</p>

<p>This tradeoff is probably worth it.&nbsp; In fact, it establishes
a new baseline that might be tough to beat.</p>
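<p>For illustration, here is roughly what "zlib every version" looks like using
Python's standard <code>zlib</code> module.&nbsp; The helper names and the sample data
are assumptions for this sketch, and it uses the default compression level,
which may not match the setting behind the numbers above.</p>

<pre>
```python
import zlib


def store_zlib(versions):
    """Compress each version independently; no version depends on another."""
    return [zlib.compress(data) for data in versions]


def fetch_zlib(stored, index):
    """Rebuild one version with a single decompress call."""
    return zlib.decompress(stored[index])


# Made-up stand-in data: repetitive text, like most source code.
versions = [b"public int Foo() { return 42; }\n" * (100 + n) for n in range(5)]
stored = store_zlib(versions)
full_size = sum(len(v) for v in versions)
zlib_size = sum(len(blob) for blob in stored)
print(full_size, "bytes in full,", zlib_size, "bytes with zlib")
```
</pre>

<p>Because each blob stands alone, fetching any version costs exactly one
decompress call, no matter how old it is.</p>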

<p>How do we get better than this?</p>

<h3>Deltas</h3>

<p>Instead of just compressing every file independently, we
could store things as deltas.&nbsp; Think of a delta as simply the difference
between one version and the next.</p>

<p>Compression with zlib takes one standalone thing and makes
an equivalent standalone thing which is smaller.</p>

<p>In contrast, a delta is a representation of the differences
between two files.&nbsp; Suppose that somebody takes file X and makes a few changes
to it, resulting in file Y.&nbsp; With a delta algorithm, we could calculate the
delta between X and Y, and call it D.&nbsp; Then, instead of storing Y, we can store
D.</p>

<p>The nice thing here is that D will be approximately the size
of the edits, regardless of the size of the two files.&nbsp; If X was a 100 MB file
and Y was the same file with an extra 50 bytes appended to the end, then D will
be somewhere around 50 bytes.</p>

<p>A delta is a concept which might be implemented in a lot of
different ways.&nbsp; In my case, the delta algorithm I am using is VCDIFF, which is
described in <a href="http://www.faqs.org/rfcs/rfc3284.html">RFC 3284</a>.&nbsp; We
have our own implementation of VCDIFF.&nbsp; Other implementations include <a
href="http://xdelta.org/">xdelta</a> and <a
href="http://code.google.com/p/open-vcdiff/">open-vcdiff</a>.</p>

<p>The important thing to remember about deltas for storage is
that you must have the reference item.&nbsp; D is a representation of Y, but only if
you have X handy.&nbsp; X is the reference.</p>
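<p>To make X, Y and D concrete, here is a toy delta in Python.&nbsp; It is
<i>not</i> VCDIFF and not the implementation used for these experiments.&nbsp; It is a
line-based copy/insert encoding built on the standard <code>difflib</code> module,
and the function names are made up for the example.</p>

<pre>
```python
from difflib import SequenceMatcher


def make_delta(reference, target):
    """Describe target as copies from reference plus literal inserts."""
    ref_lines = reference.splitlines(keepends=True)
    tgt_lines = target.splitlines(keepends=True)
    matcher = SequenceMatcher(None, ref_lines, tgt_lines, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))  # reuse reference lines i1 up to i2
        elif tag != "delete":
            ops.append(("insert", b"".join(tgt_lines[j1:j2])))
        # deleted reference lines need no op: they are simply never copied
    return ops


def apply_delta(reference, ops):
    """Rebuild the target; this only works with the reference in hand."""
    ref_lines = reference.splitlines(keepends=True)
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(ref_lines[op[1]:op[2]])
        else:
            out.append(op[1])
    return b"".join(out)


x = b"".join(b"line %d\n" % n for n in range(1000))
y = x + b"// a few bytes of new code appended at the end\n"
d = make_delta(x, y)
print(d)
```
</pre>

<p>For this X and Y, D is one "copy everything" instruction plus the appended
line, so its size tracks the edit rather than the file.&nbsp; Note that
<code>apply_delta</code> takes X as an argument: without the reference, D is
useless.</p>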

<p>OK, it should be obvious that this concept can be helpful in
storing a repository, but how do we set things up?</p>

<h3>One big delta chain</h3>

<p>As a first attempt, let's store all 508 versions as a big
chain of deltas.&nbsp; Every version is stored as a delta against the version just
before it.&nbsp; Version 1 is the reference, and is the only version that is not
stored as a delta.&nbsp; </p>

<p style='margin-left:.5in'>Size:&nbsp; 7,682 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; Way too long to wait</p>

<p>Wow -- this is really small.&nbsp; It's over 93% smaller than the
full/uncompressed form.&nbsp; It'll be hard to find a general purpose approach that
is smaller than this.</p>

<p><img border=0 width=500 height=84
src="http://software.ericsink.com/entries/1768_image001.jpg"></p>

<p>But good grief this is slow.&nbsp; Fetching version 508 takes an
eternity, because first you have to construct a temporary version of 507.&nbsp; And to
construct version 507, you first have to construct a temporary version of 506.&nbsp;
And so on.</p>

<h3>Key frames</h3>

<p>Let's try something else.&nbsp; The problem with the chaining
case above is that retrieving version 508 requires us to go all the way back to
version 1, which is incredibly inefficient.&nbsp; Instead, let's insert "key frames"
every 10 versions.&nbsp; We borrow this idea from the video world, where compressed
video streams store most frames as deltas but periodically insert a
full frame that can be decoded on its own.</p>

<p>By using key frames with chaining deltas, we can cut the
time required to fetch the average version of the file.&nbsp; For example, with a
key frame every 10 versions, we get most of the benefits of chaining, but in
the worst case, we only need 9 delta operations to retrieve any version.</p>
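<p>The worst-case claim is easy to check by counting delta operations.&nbsp; This
Python sketch assumes versions are numbered from 1 and that key frames sit at
versions 1, 11, 21 and so on, which is one plausible reading of "every 10
versions".&nbsp; It counts operations only; it says nothing about how long each
operation takes.</p>

<pre>
```python
def delta_ops(version, interval=None):
    """Delta applications needed to rebuild a version (numbered from 1).

    interval=None models one big chain rooted at version 1; otherwise
    versions 1, 1 + interval, 1 + 2 * interval, ... are key frames.
    """
    if interval is None:
        return version - 1
    return (version - 1) % interval


versions = range(1, 509)
print(max(delta_ops(v) for v in versions))       # 507
print(sum(delta_ops(v) for v in versions))       # 128778
print(max(delta_ops(v, 10) for v in versions))   # 9
print(sum(delta_ops(v, 10) for v in versions))   # 2278
```
</pre>

<p>Fetching all 508 versions one at a time takes 128,778 delta operations
with one big chain, but only 2,278 with a key frame every 10 versions.</p>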

<p style='margin-left:.5in'>Size: 18,024 KB</p>

<p style='margin-left:.5in'>Time: 41.0 sec</p>

<p>This is better, but still not very good.&nbsp; The compression here
isn't much better than zlib, and the perf is still a lot worse.&nbsp; Compared to
zlib, we don't want to pay a 10x speed penalty just to get 20% better
compression.</p>

<p>All the key frames are stored as full and uncompressed
files, and they're taking up a lot of space.&nbsp; Maybe we should zlib those key frames?</p>

<p style='margin-left:.5in'>Size: 9,092 KB</p>

<p style='margin-left:.5in'>Time: 42.7 sec</p>

<p>Now at least the compression is starting to look
interesting.&nbsp; This is less than half the size of the zlib case, and 91.9%
smaller than the full form, which is a level of compression that is probably
worth the trouble.&nbsp; But the overall perf is still quite slow.&nbsp; In fact, it's
even slower here than plain chaining with key frames, because we have to
un-zlib the key frame.</p>

<h3>Flowers</h3>

<p>The big problem here is that chains of deltas are killing
our performance.&nbsp; Chained deltas can be used to make things very small because
each delta matches up nicely with one set of user edits.&nbsp; But chained deltas
are slow because we need multiple operations to retrieve a given file.</p>

<p>Another approach would be to use each reference for more
than one delta.&nbsp; I call this the flower approach.&nbsp; With a flower, we deltify a
line of versions by picking one version (say, the first one) and using it as
the reference to make all the others into deltas.</p>

<p>Flower deltas should be much faster, since any file can be reconstructed
with just one undeltify operation.</p>

<p><img border=0 width=340 height=203
src="http://software.ericsink.com/entries/1768_image002.jpg"></p>

<p>So let's try to flower all 508 versions using version 1 as
the reference for all of them.</p>

<p style='margin-left:.5in'>Size:&nbsp; 35,851 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; 10.9 sec</p>

<p>As expected, the performance here is much better.</p>

<p>But the overall space savings is lousy.&nbsp; Only version 2 was
based directly on version 1.&nbsp; Every version after that has less and less in
common with version 1, so the delta algorithm can't draw as much stuff from the
reference.</p>

<p>This particular approach isn't going to win.&nbsp; Plain zlib is both
smaller and faster.</p>

<h3>Flowers with key frames</h3>

<p>Maybe we should try the flower concept with key frames?</p>

<p>Like before, every 10 frames go together as a group.&nbsp; But instead
of chaining, we're going to run each group as a flower.&nbsp; The first version in
the group will serve as the reference for the other 9.&nbsp; We can reasonably
assume that the deltification of frame 10 won't be as good as frame 2, but
hopefully 10 and 1 still have enough in common to be worthwhile.</p>
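<p>The only difference between the chain layout and the flower layout is which
version each delta points at.&nbsp; A small Python sketch, with a made-up function
name and the same assumption as before that key frames fall on versions 1, 11,
21 and so on:</p>

<pre>
```python
def reference_for(version, interval=10, layout="flower"):
    """Return the version that `version` is stored as a delta against.

    Versions are numbered from 1. None means a key frame, stored without a delta.
    """
    key_frame = version - (version - 1) % interval
    if version == key_frame:
        return None
    if layout == "flower":
        return key_frame      # every petal points straight at the key frame
    return version - 1        # "chain": each version points at its predecessor


for v in (1, 2, 10, 11, 508):
    print(v, reference_for(v, layout="chain"), reference_for(v, layout="flower"))
```
</pre>

<p>With flowers, version 508 is a delta against 501 and needs one undeltify
operation.&nbsp; With chains, it is a delta against 507 and needs seven.</p>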

<p style='margin-left:.5in'>Size:&nbsp; 18,648 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; 12.2 sec</p>

<p>Wow.&nbsp; This looks a lot better than chaining.&nbsp; The space used
is about 17% smaller than zlib, but instead of being 10 times slower, it's only
3 times slower.</p>

<p>Of course, we can use the same trick we tried before.&nbsp; Let's
zlib all those key frames.</p>

<p style='margin-left:.5in'>Size:&nbsp; 9,716 KB</p>

<p style='margin-left:.5in'>Time:&nbsp; 13.6 sec</p>

<p>This seems like a potentially useful spot.&nbsp; It's less than
half the size of zlib.&nbsp; The perf is still a lot slower than zlib, but at only
about 3X slower, the tradeoff is the best we've seen so far.</p>

<p>OK.&nbsp; So we've made a lot of progress on saving space, but 3X
slower than zlib still seems like a high price to pay.&nbsp; Do we really want to
make that trade?&nbsp; Do we have to?</p>

<h3>Some things get retrieved more often than others</h3>

<p>Let's look at the patterns for how this data is going to be
accessed.</p>

<p>I've been reporting the total time required to fetch all 508
versions of the file.&nbsp; However, this benchmark doesn't reflect real usage very
well at all.&nbsp; In practice, the recent stuff gets retrieved a LOT more often
than the older stuff.&nbsp; Most of the time, developers are updating their working
copy to whatever is latest.</p>

<p>As a rough guess, I'm going to say that version 508 gets
retrieved twice as often as 507, which gets retrieved twice as often as 506,
and so on.&nbsp; A timing test based on that assumption gives us results something
like this:</p>

<p style='margin-left:.5in'>Full:&nbsp; 1.1 sec</p>

<p style='margin-left:.5in'>Zlib:&nbsp; 1.7 sec</p>

<p style='margin-left:.5in'>One big flower:&nbsp; 4.0 sec</p>

<p style='margin-left:.5in'>Flower with key frames:&nbsp; 5.1 sec</p>

<p style='margin-left:.5in'>Chain with key frames:&nbsp; 24.5 sec</p>

<p>Not too surprising.</p>
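<p>How the weighted timing test was built is not spelled out above, so the
following Python sketch is only one way to model the "twice as often" guess.&nbsp;
The function name, the sample count and the fixed seed are assumptions for the
example.</p>

<pre>
```python
import random
from collections import Counter


def sample_fetches(latest, count, seed=0):
    """Draw versions so each is fetched twice as often as its predecessor."""
    versions = list(range(1, latest + 1))
    weights = [2.0 ** (v - latest) for v in versions]
    return random.Random(seed).choices(versions, weights=weights, k=count)


fetches = sample_fetches(latest=508, count=10000)
for version, hits in Counter(fetches).most_common(5):
    print(version, hits)
```
</pre>

<p>Under this model about half of all fetches are for version 508, and almost
none reach back more than ten or so versions, so the timing is dominated by how
quickly a scheme can produce the newest few versions.</p>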

<p>In the spirit of optimizing performance for the most common
operations, why not keep all the more recent versions in a faster form?&nbsp; We
could still use something more aggressive for the older stuff, but we can
probably get a nice performance boost if we just refuse to use deltification
for the most recent 10 versions of the file.</p>

<p>But how should we store those 10 versions?&nbsp; In full format?&nbsp;
Or zlib?&nbsp; This is an arbitrary choice with a clear tradeoff.&nbsp; For now, I choose
zlib.&nbsp; If we wanted a little more speed at the expense of using a little more
space, we could just keep those 10 versions in full form.</p>

<p>With zlib chosen for the most recent 10 revisions, my "get
the recent stuff" benchmark now runs in 1.7 seconds no matter what scheme I use
for the older versions.</p>

<p>But we still care about performance for the case where
somebody fetches an older version, even if that fetch doesn't happen as often.&nbsp;
That's the point of version control storage.&nbsp; Every version has to be
available.&nbsp; And when somebody does fetch version 495, we want our version
control system to still be reasonably fast.</p>

<h3>Reversing the direction of the chains</h3>

<p>Since the more recent versions are retrieved more often,
obviously, our chains are all going the wrong direction.&nbsp; If we had them go the
other way, then retrieval would get slower as the versions get older instead of
as the versions get newer.</p>

<p>But this approach doesn't lend itself well to the way
version control repositories naturally grow in the wild.&nbsp; In these tests, I
have mostly ignored the issues of constructing each storage scheme.&nbsp; I've
already got all 508 versions, so I'm just fiddling around with different
schemes of storing them all, comparing size and retrieval time.</p>

<p>In practice, those 508 versions happened one at a time, in
order.&nbsp; If we're going to store the versions with backward chains, then each
time we commit a new version of the file, we're going to need to re-encode something
that was previously stored.&nbsp; This makes the commit operation slower.&nbsp; It is
also a questionable idea from the perspective of data integrity.&nbsp; The safest
way to maintain data is to not touch it after it has been written.&nbsp; Once it's
there, leave it alone.</p>
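<p>To see why backward chains force a rewrite on every commit, consider this
toy Python model.&nbsp; The delta format here (keep a shared prefix, then append a
literal tail) is a deliberately crude stand-in invented for the example, and the
class and its names are hypothetical.</p>

<pre>
```python
def prefix_delta(reference, target):
    """Toy delta: keep a shared prefix of reference, then append a literal tail."""
    keep = 0
    limit = min(len(reference), len(target))
    while keep != limit and reference[keep] == target[keep]:
        keep += 1
    return keep, target[keep:]


class BackwardChain:
    """Newest version in full; each older one is a delta against its successor."""

    def __init__(self):
        self.tip = None     # newest version, stored in full
        self.deltas = []    # deltas[i] rebuilds version i + 1 from version i + 2
        self.rewrites = 0   # how many already-stored versions were re-encoded

    def commit(self, data):
        if self.tip is not None:
            # The old tip was stored in full; it must now become a delta.
            self.deltas.append(prefix_delta(data, self.tip))
            self.rewrites += 1
        self.tip = data

    def fetch(self, version):
        data = self.tip
        for keep, tail in reversed(self.deltas[version - 1:]):
            data = data[:keep] + tail
        return data


store = BackwardChain()
history = [b"a\n", b"a\nb\n", b"a\nb\nc\n"]
for data in history:
    store.commit(data)
print(store.rewrites)   # 2
```
</pre>

<p>Fetching the newest version costs nothing beyond a read, but every commit
after the first re-encodes data that was already safely on disk.</p>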

<p>One case where we <i>might</i> want to be a bit more liberal
toward rewriting data is in a "pack" operation, such as the one Git has.&nbsp; It
wouldn't be terribly crazy to consider a standalone pack operation in a DVCS to
be better than rewriting data for each commit, for several reasons:</p>

<ul style='margin-top:0in' type=disc>
 <li >It allows us to keep commit fast.<br>
     <br>
 </li>
 <li >Since pack would be done offline, its implementation can
     be focused more on data integrity and space savings than on performance.<br>
     <br>
 </li>
 <li >Since the pack code can be separated from the commit code,
     all the risky code can be kept together where it is easier to maintain.<br>
     <br>
 </li>
 <li >Since the pack operation is separate from commit, a user
     that does not want to run pack does not have to.<br>
     <br>
 </li>
 <li >A pack operation in a DVCS is happening on just one
     instance (clone) of the repository, not on the only copy.</li>
</ul>

<p>Anyway, a pack operation would allow us to use storage
schemes that do not work well on the fly, incrementally updating as each
version comes in.</p>

<h3>Visualizing the results</h3>

<p><img border=0 width=550 height=370
src="http://software.ericsink.com/entries/1768_image003.gif"></p>

<p>This plot makes it easier to see which schemes are better
than others.&nbsp; </p>

<p>In my experimentation, I actually tried a lot more schemes.&nbsp;
For example, instead of key frames every 10 versions, I also tried every 5, 15
and 20.&nbsp; However, all those extra data points really cluttered the graph.&nbsp; So I
only included the most important ones here.</p>

<ul style='margin-top:0in' type=disc>
 <li >In the lower right, we find "full".&nbsp; Very fast and very
     large.<br>
     <br>
 </li>
 <li >In the upper left, we find "chains".&nbsp; Very slow and very
     small.<br>
     <br>
 </li>
 <li >We can ignore any point which is both above AND to the
     right of any other point.&nbsp; The "1flower" point is the one where I made one
     big flower, using version 1 as the reference for every other version.&nbsp;
     This scheme ends up being useless since zlib is better in both ways that
     matter.<br>
     <br>
 </li>
 <li >All the other points represent possible tradeoffs which
     might be interesting, depending upon our priorities.</li>
</ul>
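<p>The "above AND to the right" rule can be applied mechanically.&nbsp; This Python
sketch uses the size and time figures reported earlier in this entry for fetching
all 508 versions.&nbsp; Which of these points actually appear on the plot is not
assumed, the scheme labels are made up for the example, and the one big chain is
left out because no time was recorded for it.</p>

<pre>
```python
# (size in KB, seconds to fetch all 508 versions), as reported in this entry.
RESULTS = {
    "full": (112643, 2.5),
    "zlib": (22516, 4.0),
    "chain, key frames": (18024, 41.0),
    "chain, zlib key frames": (9092, 42.7),
    "1flower": (35851, 10.9),
    "flowers, key frames": (18648, 12.2),
    "flowers, zlib key frames": (9716, 13.6),
}


def dominators(name, results=RESULTS):
    """Schemes that are no larger and no slower than `name`."""
    size, secs = results[name]
    return [
        other
        for other, (o_size, o_secs) in results.items()
        if other != name and size >= o_size and secs >= o_secs
    ]


print(dominators("1flower"))   # ['zlib']
print(dominators("zlib"))      # []
```
</pre>

<p>A scheme with an empty list is a genuine tradeoff: nothing else in the
table beats it on both size and speed at once.</p>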

<p>Intuitively, the schemes which are closer to the origin are
better.&nbsp; This graph makes it easy to see that "zlib" and "flowers" are probably
the two most interesting options I have discussed here.</p>

]]>
</description>
</item>

<item>
<title>Ten Quirky Issues with Cross-Platform Version Control</title>
<guid>http://software.ericsink.com/entries/quirky.html</guid>
<link>http://software.ericsink.com/entries/quirky.html</link>
<pubDate>Mon, 13 Apr 2009 13:16:27 CST</pubDate>
<description>
<![CDATA[
<p>A big chunk of the software industry today can mostly ignore
the issues of multiple platforms, for one of the following reasons:</p>

<ol style='margin-top:0in' start=1 type=1>
 <li >They only support Windows. &nbsp;It's got like 90% market
     share, so why not?<br>
     <br>
 </li>
 <li >They serve a web application and don't care what the end
     user is actually using as long as their browser works.</li>
</ol>

<p>But version control tools involve more cross-platform
concerns than most other kinds of software.&nbsp; Neither of the reasons above tends
to work very well.</p>

<ol style='margin-top:0in' start=1 type=1>
 <li >If a software team has 450 Windows users and 50 people on
     Mac or Unix, then a Windows-only solution just won't do.<br>
     <br>
 </li>
 <li >Since a primary task of a version control tool is to
     manage source code trees on the user's hard disk, a web application just
     won't do.</li>
</ol>

<p>So, even as most coders have moved on to a world where they
can remain blissfully ignorant of the problems of writing software for multiple
operating systems, those of us who create version control tools are still wrestling
with those problems.</p>

<p>And in fact, I claim that our challenges are tougher than most.&nbsp;
Version control users ask for the darndest things, especially in the big
enterprise companies.&nbsp; It's easy to believe that all you need is Windows, Mac,
Linux and maybe Solaris.&nbsp; Then you find out just how prevalent things like AIX
and HP-UX are.&nbsp; Terms like "Irix" and "Win95" and "mainframe" get tossed around
until you're numb and nothing surprises you anymore.&nbsp; When somebody asks for a
port to an arcane platform, you roll your eyes and wonder if it uses 8-bit
bytes or <a href="http://en.wikipedia.org/wiki/CDC_Cyber">not</a>.</p>

<p>Worse than that, version control vendors aren't just <i>porting</i>
to oddball operating systems.&nbsp; We actually have to make our software <i>interoperate</i>
across all those environments.</p>

<p>And that's where things start to get quirky.</p>

<ol style='margin-top:0in' start=1 type=1>
 <li >On a Linux system, create a file called "README".&nbsp; In the
     same directory, create a file called "readme".&nbsp; Check them both in.&nbsp; Now
     go to a Mac and check them both out.&nbsp; Since the Mac file system is
     [usually] case insensitive, something bad is going to happen.&nbsp; Same goes
     for Windows/NTFS.<br>
     <br>
 </li>
 <li >On a Mac, check in a file called "PRN".&nbsp; Check it out on a
     Windows system.&nbsp; That file name is <a
     href="http://stackoverflow.com/questions/62771/how-check-if-given-string-is-legal-allowed-file-name-under-windows">not
     allowed</a> under Windows, for backward compatibility with MS-DOS.<br>
     <br>
 </li>
 <li >Under Linux, check in a file with a name that ends in a
     dot.&nbsp; Check it out under Windows.&nbsp; The trailing dot is probably gone.&nbsp; Now
     check the file back in and go back to your Linux system.&nbsp; If your version
     control system handled this badly, you've probably got two copies of the
     file, one with the trailing dot, and one without.&nbsp; Same goes for a
     trailing space.<br>
     <br>
 </li>
 <li >On a Linux system, check in a file with a path that is 261
     characters long.&nbsp; Check it out under Windows.&nbsp; This might work.&nbsp; It
     probably won't.&nbsp; It kind of depends on whether .NET is involved or not.&nbsp;
     There's a \\?\ trick to get around the limitations of the Win32 layer, but
     the .NET libraries don't use it. <br>
     <br>
 </li>
 <li >On a Mac, check in a file that has a <a
     href="http://en.wikipedia.org/wiki/Resource_fork">resource fork</a> and
     some Finder info.&nbsp; Check it out on a Linux machine.&nbsp; What happens?&nbsp; Did
     stuff show up as <a
     href="http://en.wikipedia.org/wiki/Extended_file_attributes">xattrs</a>?&nbsp;
     Should it have?&nbsp; On that same Linux machine, make a change and check it
     back in.&nbsp; Then check it out on the Mac again.&nbsp; Is the Finder info still
     there?<br>
     <br>
 </li>
 <li >On a Linux machine, check in a file with a colon in the
     name.&nbsp; Check it out on a Mac.&nbsp; Not sure what'll happen, but it probably
     won't be what you want.<br>
     <br>
 </li>
 <li >On a Windows machine, check in a file with a name that
     begins with a dash.&nbsp; Now check it out under Linux and <a
     href="http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html">try
     manipulating it</a> with command-line utilities.&nbsp; Apps will think the
     filename is a command-line option.&nbsp; <s>What if somebody creates a file
     named "-rf *" and a Linux user tries to rm it?&nbsp; </s><i>OK, bad example.&nbsp;
     The point remains:&nbsp; Filenames which begin with a dash may cause more
     problems on some platforms than on others.</i><br>
     <br>
 </li>
 <li >On a Linux machine, create a source code file and check it
     in.&nbsp; Check it out on Windows and open it with Notepad.&nbsp; The <a
     href="http://en.wikipedia.org/wiki/Newline">line-endings</a> are Unix-standard
     LF, but Windows apps expect CRLF, so Notepad shows the entire file as one
     line.&nbsp; Now open the same file under Visual Studio.&nbsp; The file looks fine
     now.&nbsp; Now edit a few lines in the middle of the file, check it back in,
     and check it out on Linux again.&nbsp; The lines you edited are messed up.<br>
     <br>
 </li>
 <li >On a recent Ubuntu Linux system, create a file called "Español".&nbsp;
     Do the same thing on Mac OS X 10.5.&nbsp; These two files have the same name,
     but even though you are [probably] using the utf-8 encoding of Unicode on
     both systems, the bytes which represent that name do not match.&nbsp; On the
     Linux machine, the file name [probably] will be in NFC normalized form (Espa\u00f1ol).&nbsp;
     On the Mac, everything gets normalized to NFD (Espan\u0303ol).&nbsp; When you
     check these files in and start working with them, bad things will happen
     unless your version control tool <a
     href="http://svn.haxx.se/dev/archive-2008-03/0780.shtml">understands</a>
     what's going on and deals with it appropriately.<br>
     <br>
 </li>
 <li >On a Unix machine, check in a symbolic link.&nbsp; Check it out
     on Windows.&nbsp; What happens?</li>
</ol>
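<p>Several of the file name problems in this list can be detected before they
bite.&nbsp; The Python sketch below is a hypothetical pre-checkin check, not something
any particular version control tool is known to do.&nbsp; The function name, the
messages and the partial list of reserved names are assumptions for the example.&nbsp;
It looks at repository-relative paths only, so the length check would be too
lenient for a working copy that lives deep inside a Windows directory tree.</p>

<pre>
```python
import unicodedata

# Device names that Windows reserves regardless of extension (a partial list).
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    "%s%d" % (device, n) for device in ("COM", "LPT") for n in range(1, 10)
}
MAX_PATH = 260  # classic Win32 limit, which includes the terminating NUL


def portability_warnings(paths):
    """Return (path, problem) pairs for names likely to misbehave elsewhere."""
    warnings = []
    seen = {}
    for path in paths:
        # Case folding plus NFC makes "README"/"readme" and NFC/NFD spellings equal.
        key = unicodedata.normalize("NFC", path).casefold()
        if key in seen:
            warnings.append((path, "collides with %r" % seen[key]))
        seen.setdefault(key, path)
        if len(path) >= MAX_PATH:
            warnings.append((path, "too long for plain Win32 APIs"))
        for name in path.split("/"):
            if name.split(".")[0].upper() in WINDOWS_RESERVED:
                warnings.append((path, "reserved device name on Windows"))
            if name.endswith((".", " ")):
                warnings.append((path, "trailing dot or space"))
            if ":" in name:
                warnings.append((path, "colon in name"))
            if name.startswith("-"):
                warnings.append((path, "leading dash"))
    return warnings


tree = ["README", "readme", "PRN", "notes.", "a:b", "-rf",
        "Espa\u00f1ol", "Espan\u0303ol"]
for path, problem in portability_warnings(tree):
    print(repr(path), problem)
```
</pre>

<p>Resource forks, line endings and symbolic links are not covered here,
because those problems live in the file's contents and metadata rather than in
its name.</p>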

<p>Like I said, things get quirky.</p>

]]>
</description>
</item>

<item>
<title>Mercurial, Subversion, and Wesley Snipes</title>
<guid>http://software.ericsink.com/entries/hg_denzel.html</guid>
<link>http://software.ericsink.com/entries/hg_denzel.html</link>
<pubDate>Mon, 06 Apr 2009 08:55:00 CST</pubDate>
<description>
<![CDATA[
<p>People keep asking me why I don't talk more about Mercurial
in this series of blog entries.&nbsp; There's a simple answer to that question:</p>

<p style='text-indent:.5in'>Mercurial isn't very interesting.</p>

<p>Wait, that didn't come out quite right.&nbsp; Let me try again:</p>

<p style='margin-left:.5in'>Git is Wesley Snipes.</p>

<p style='margin-left:.5in'>Mercurial is Denzel Washington.</p>

<p>Hmm, that probably needs further explanation.&nbsp; First let me
give a little background.</p>

<p>I am the founder of a version control software company.&nbsp;
I've done lots of writing about the topic here on my blog.&nbsp; Currently I am in
the process of revising and expanding all those writings to turn them into a
book.&nbsp; </p>

<p>As part of that effort, I have undertaken an exploration of
the DVCS world.&nbsp; Several weeks ago I started writing one blog entry every week,
mostly focused on DVCS topics.&nbsp; In chronological order, here they are:</p>

<ul style='margin-top:0in' type=disc>
 <li >The <a href="http://software.ericsink.com/entries/git_index.html">one</a>
     where I gripe about Git's index</li>
 <li >The <a href="http://software.ericsink.com/entries/git_immutability.html">one</a>
     where I whine about the way Git allows developers to rearrange the DAG</li>
 <li >The <a href="http://software.ericsink.com/entries/dvcs_dag_1.html">one</a>
     where it looks like I am against DAG-based version control but I'm really
     not</li>
 <li >The <a href="http://software.ericsink.com/entries/dvcs_dag_2.html">one</a>
     where I fuss about DVCSes that try to act like centralized tools</li>
 <li >The <a href="http://software.ericsink.com/entries/dbts_fossil.html">one</a>
     where I complain that DVCSes have a lousy story when it comes to
     bug-tracking</li>
 <li >The <a href="http://software.ericsink.com/entries/merge_history.html">one</a>
     where I lament that I want to like Darcs but I can't</li>
 <li >The <a href="http://software.ericsink.com/entries/why_is_git_fast.html">one</a>
     where I speculate cluelessly about why Git is so fast</li>
</ul>

<p>Along the way, I've been spending some time getting hands-on
experience with these tools.&nbsp; I've been using Bazaar for several months.&nbsp; I
don't like it very much.&nbsp; I am currently in the process of switching to Git,
but I don't expect to like it very much either.</p>

<p>Why am I using these tools if I don't like them?&nbsp; Because I
want the experience.&nbsp; I don't want to write hearsay.&nbsp; I want to live with these
tools and see what I learn.</p>

<p>So why don't I write about Mercurial?&nbsp; Because I'm pretty
sure I would like it.</p>

<p>I chose Bazaar and Git for the experience.&nbsp; But if I were
choosing a DVCS as a regular user, I would choose Mercurial.&nbsp; I've used it
some, and found it to be incredibly pleasant.&nbsp; It seems like the DVCS that got
everything just about right.&nbsp; That's great if you're a user, but for a writer,
what's interesting about that?</p>

<p>Denzel Washington is a great actor.&nbsp; Other than that, he
lives a pretty normal life.&nbsp; What's interesting about that?</p>

<p>In contrast, Wesley Snipes gives the world lots of things to
write about.&nbsp; Tax evasion.&nbsp; Conviction for reckless driving.&nbsp; Martial arts.</p>

<p>People admire Denzel Washington.&nbsp; But Wesley Snipes is
simply more interesting.</p>

<p>I admire Mercurial.&nbsp; But Git is more interesting.&nbsp; Like
Snipes, Git is an odd juxtaposition of great power with some quirky flaws and
failings.</p>

<p>One more thing:</p>

<p>People also ask me why I don't write more about Subversion.&nbsp;
That's easy too:</p>

<p>Subversion is Morgan Freeman.</p>

]]>
</description>
</item>


</channel>
</rss>
