Monday, April 14, 2008
Print Encyclopedias Join Dinosaurs


Part 2

by Michael S. Hart
Founder, 1971
Project Gutenberg
Inventor of eBooks


In 1985 when Gary Kildall, IBM's first choice before Bill Gates
to design their PC's operating system a few years earlier, came
out with the first electronic encyclopedia, who would figure it
would be only a quarter of a century before print encyclopedias
faded from the limelight to join vinyl records and dinosaurs?

$999 would buy you an external Sony CD drive and Grolier's CD--
pretty much the same price as the paper encyclopedias, but with
the option of putting any number of CDs in the drive.

This was only a year after the famous "1984" Super Bowl ad that
ran only once and changed Super Bowl ads forever.

It was only a year after IBM offered the AT.

It was the beginning of the second generation of big time home
computers and the truth is that very few people realized what a
huge difference this was all going to make.

As for my biggest public plans for the future, I, personally,
was working on putting Shakespeare online.

In private, I was making 50 foot printer cables for the U of I,
for $109, mostly because everyone had said it was impossible.


Note:

I should add here, for the benefit of one of the trolls I know,
that I predicted back in the early 1980's that the wide variety
of very expensive printer cables would be replaced by a narrow,
very narrow, variety costing only a fraction as much.

How did I know this?

I saw the same thing happen with hi-fi cables.

When my Dad bought our first hi-fi 55 years ago he had to hire,
literally hire, an expert to come out to our house and make the
cables that wired the various components together for $34 each.

This would be about $250 in today's money!!!

Yet you can buy most stereo [his were mono] cables today for an
average price of a couple dollars, though you can obviously get a
gold, silver, nickel or whatever cable for much more.  We could
only get plain wires and connectors back then.

So it was obvious to me that the stunning array of cables would
not continue, nor their stunning prices.



I was also charging $180/hour for computer consulting, as I was
one of the only people anyone knew of who did both hardware and
software consulting, which was a HUGE advantage, in an age when
most users could not tell you whether a problem was hardware or
software related.

It was a kind of "Golden Age" of computing, but expensive.

A full tilt IBM-AT or Mac system might have cost you $10,000 in
dollars that would be worth over $20,000 today, according to several
of the "Inflation Calculators" available online today.

Can you imagine spending $20,000 on a computer today???

It is hard to spend $2,000 on a computer today, one hundreds of
times more powerful, with 10,000 times the drive storage, color
palettes with over 16 million colors, etc., etc., etc.

The average computer today sells for under $500.


Note:

I said hundreds of times faster, but. . . .

The first PCs ran at a few megahertz; today's computers are now
running at a few gigahertz, but you have to multiply that by the
increased "word" size of how many bits get run per hertz.  Most
of today's computers run 64 bits, while the early PCs ran 8 bits
and I seem to recall some that ran only 4 bits, but that was on
computers earlier than the ones we are talking about here.

So, if you have a chip running at 2.44 gigahertz, that is half,
literally by the clock, of 1,000 times as fast as those first
IBM-PCs that ran at 4.88 megahertz, or 500 times as fast.  However,
you have the "word" size at 64 bits, which is 8 times as much, so
the overall difference is 4,000 times as fast.  Warning:  not all
programs will run at 64 bits, so your mileage may vary; and the
same is true, sad to say, for the multi-processor systems: not all
programs run perfectly on those multi-processor systems, so your
mileage will vary even more on those systems.
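
A quick sketch of the arithmetic in this note, using only the
figures quoted above.  The "clock ratio times word-size ratio"
model is the rule of thumb from the text, not a benchmark, and
the names below are made up for the illustration.

```python
# Figures as quoted in the note; real-world speedups depend on far more
# than clock rate and word size, so treat this as a rule of thumb only.
OLD_CLOCK_HZ = 4.88e6   # first IBM-PC clock rate as given in the text
NEW_CLOCK_HZ = 2.44e9   # example modern chip from the text
OLD_WORD_BITS = 8
NEW_WORD_BITS = 64


def speedup(old_hz, new_hz, old_bits, new_bits):
    """Return (clock ratio, word-size ratio, combined ratio)."""
    clock = new_hz / old_hz
    word = new_bits / old_bits
    return clock, word, clock * word


clock_ratio, word_ratio, overall = speedup(
    OLD_CLOCK_HZ, NEW_CLOCK_HZ, OLD_WORD_BITS, NEW_WORD_BITS
)
print(f"clock: {clock_ratio:.0f}x, word size: {word_ratio:.0f}x, "
      f"overall: {overall:.0f}x")
```

Multiplying the two ratios is what turns "hundreds of times" by
the clock into thousands of times overall.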

By the way, in researching this article I read a few articles a
decade or two old about how computer speed could not keep going
faster and faster, much as a mile runner who started at 10 minutes
per mile, and got to 8 minutes per mile after a while, later
6 minutes per mile, could obviously NOT be expected to then get
to 4 minutes and then 2 minutes per mile.

Well, surprise, surprise, those pundits were wrong and the news
of today contains the first chips running at 1,000 times the speed
of the first IBM-PC in terms of clock rate, and which will have
at least 8 times the bit count per clock tick, when they arrive
over the counter in yet another impossible generation of CPUs.




Back To Encyclopedias


Once the first CDROM encyclopedia was out, others followed,
prices got very competitive, and "bundles" of software appeared
with many computers offering the computer, hard drive and CDROM
drive, and all the other components, and a stack of CDs all for
less than the first $999 offer mentioned above.

I, myself, bought two such systems for about $700 each, with an
assortment of CDs including: the Grolier's, Microsoft Bookshelf,
world and US atlases, a great multi-language dictionary, and an
assortment of other programs.

That's how much things changed in perhaps the next five years.

A few years ago I got the 2002 Britannica, brand new, alone, at
Best Buy for about $17, combined with some other purchase.

Only a few of the most hard-boiled conservative products like a
copy of the Oxford English Dictionary on disc still cost a real
amount of money, but with Merriam-Webster's unabridged available
at $150 with CDROM, and the American Heritage for much less, or
any of a handful of other dictionaries, the pressure is on, and
Oxford may yet have to drop its prices to the normal range, and
who knows if their famous print edition, the size of the larger
print encyclopedias, will then survive much longer.



A Little Un-Advertising


Most of the products mentioned above like to thrill you with an
assortment of huge numbers about how many thousands of articles
and millions of words they contain, but the truth is that their
contents are not really measured in that many million words.

Let's say a giant dictionary has 25,000 words and uses 100 word
average entries per word, at about 10 characters per word of an
entry, for 25,000,000 characters.

25 megabytes.

Gee, that original CD from Gary Kildall could hold over 500 meg
without undue stress.

That would be 20 dictionaries of this size, plus space left for
various pieces of software to enhance the process.

Today's flash drives can give you 25 GIGABYTES for $100.

A thousand times as much storage, read/write, much faster.
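
Here is a minimal sketch of the dictionary arithmetic.  The entry
count, entry length, CD size and flash drive size come from the
text; CHARS_PER_WORD = 10 is an assumption, chosen because it is
the value that makes 25,000 entries of 100 words come out to
25,000,000 characters.  One byte per character and decimal
megabytes are also assumed.

```python
# Dictionary size estimate; CHARS_PER_WORD is an assumed value (see above).
ENTRIES = 25_000
WORDS_PER_ENTRY = 100
CHARS_PER_WORD = 10
CD_CAPACITY_MB = 500         # "over 500 meg", rounded down
FLASH_CAPACITY_MB = 25_000   # 25 gigabytes in decimal megabytes

dictionary_chars = ENTRIES * WORDS_PER_ENTRY * CHARS_PER_WORD
dictionary_mb = dictionary_chars // 1_000_000   # one byte per character
dictionaries_per_cd = CD_CAPACITY_MB // dictionary_mb
flash_ratio = FLASH_CAPACITY_MB // dictionary_mb

print(f"{dictionary_chars:,} characters = {dictionary_mb} MB")
print(f"{dictionaries_per_cd} dictionaries fit on a {CD_CAPACITY_MB} MB CD")
print(f"a 25 GB flash drive holds {flash_ratio:,} such dictionaries")
```

On these figures a 500 megabyte disc has room for 20 dictionaries
of this size, and the flash drive for 1,000.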




Let's move on to the huge multi-volume behemoths.


Let's say you have a 25 volume extravaganza.

Let's even say each volume has 1,000 pages.

Let's say each page has 4,000 characters.

That's 4 million characters per volume for 100 million total.

100 megabytes.

Again, who worries about 100 megabyte files these days?

Anyone with broadband probably downloads files much larger many
times without being amazed at the result.
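
The same kind of sketch for the multi-volume set, using only the
round numbers from the text and assuming one byte per character.

```python
# Size of a hypothetical 25-volume encyclopedia, text only.
VOLUMES = 25
PAGES_PER_VOLUME = 1_000
CHARS_PER_PAGE = 4_000

chars_per_volume = PAGES_PER_VOLUME * CHARS_PER_PAGE
total_chars = VOLUMES * chars_per_volume
total_mb = total_chars // 1_000_000   # decimal megabytes

print(f"{chars_per_volume:,} characters per volume")
print(f"{total_chars:,} characters in all = {total_mb} MB")
```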



A Little More Un-Advertising


These reference sources try to pretend that you cannot download
their entire database simply because it's not feasible.

The truth is that millions of people download entire DVD images
every single day, each one of which is many times larger than a
copy of the online Britannica, or every word of any of the more
weighty reference products.

Obviously it will take more space if they include pictures.

National Geographic would probably be the most intense example,
yet they have packed their first 108 years into 36 CDs, at half
a gigabyte per CD that would be 18 gigabytes.

OK, that would take a week to download, but only because graphic
files are bulky compared to text and because you are including a
whole century of output.

However, if you ran that download in the background it might take
two weeks instead, and you would never notice the load.
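
As a rough check on those download times, this sketch works out
the total size and the average line speed that "a week" or "two
weeks" would imply.  The 36 CDs and half a gigabyte per CD are
from the text; the helper name and the use of decimal gigabytes
are assumptions for the illustration.

```python
# Average rate needed to move the National Geographic set in N days.
CDS = 36
GB_PER_CD = 0.5
SECONDS_PER_DAY = 86_400


def sustained_kbit_per_s(total_gb, days):
    """Average kilobits per second to move total_gb (decimal GB) in `days`."""
    bits = total_gb * 1e9 * 8
    return bits / (days * SECONDS_PER_DAY) / 1_000


total_gb = CDS * GB_PER_CD
for days in (7, 14):
    rate = sustained_kbit_per_s(total_gb, days)
    print(f"{total_gb:.0f} GB in {days} days needs about {rate:.0f} kbit/s")
```

Either figure is well under what even a modest broadband line is
sold as, which is the point of the paragraph above.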

Even more un-advertising would have to admit that the US has
pretty lousy bandwidth, and that the 20 or so countries with
better average bandwidth would allow an easier download than in
the US, and for less cost per file.



Bottlenecking or Artificial Scarcity


The way the MBA Generation tries to keep the information flow at
a staggeringly low level is to create artificial bottlenecks for
the transmission of that flow.

Extending copyright is the most obvious one of these, because a
public domain piece of information can be put online in numbers
of ways so vast that bottlenecking becomes irrelevant.

The fact that the modern Britannica is sized on the same order
of magnitude as the world famous 11th edition of a century ago
is pretty much totally ignored in their publicity.

The fact that you could download the entire 1911 edition in the
time it takes to eat a sandwich is something they ignore.



Here's The Truth, In Plain Numbers

Warning:  These Are Large Numbers
[You may want to stop here. . . .]


Let's take the largest unit of information, the Library of Congress,
at least that's the largest in "normal" use these days.

Let's say the Library of Congress, or another of the large
libraries at the top of the world's listings, has books numbered
in the range of about 30 million, plus or minus a few million.

Let's further say that each of these has 1 million characters.

This is being generous in the face of the fact that so many of a
collection such as this might be better described as pamphlets.

However we must also consider weighty tomes such as Shakespeare,
The Bible, or other similarly sized works at 5 million, or large
novels such as Moby Dick at 2 million.

So let's go with a million characters times ~32 million entries:

~32 trillion characters in The Library of Congress.

Remember, we are only considering the words, not the pictures.

Words compress very nicely in computer files.

You can get about 2.5 times as many words via ".zip" files or
any of the many other similar compression formats.

This means an $80 500 gigabyte drive could hold 1.25 terabytes.

This means 10 of these at $800 could hold 12.5 terabytes.

20 of these at $1,600 would hold 25 terabytes.

25 of them at $2,000 will hold 31.25 terabytes, or every word in
the average one of the world's largest libraries.
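
The drive counts above can be reproduced with a few lines.  The
drive size, drive price, 2.5x compression figure and ~32 trillion
characters are from the text; one byte per character and decimal
units are assumptions, and the function name is hypothetical.

```python
import math

# Library size and drive figures as given in the text (decimal units).
LIBRARY_GB = 32_000      # ~32 trillion characters at one byte each
DRIVE_GB = 500
DRIVE_PRICE = 80         # dollars
COMPRESSION = 2.5        # rough text compression ratio from the text

text_gb_per_drive = DRIVE_GB * COMPRESSION


def shelf(n_drives):
    """Return (terabytes of uncompressed text held, cost in dollars)."""
    return n_drives * text_gb_per_drive / 1_000, n_drives * DRIVE_PRICE


for n in (1, 10, 20, 25):
    tb, cost = shelf(n)
    print(f"{n:>2} drives: {tb:g} TB of text for ${cost:,}")

drives_for_library = math.ceil(LIBRARY_GB / text_gb_per_drive)
print(f"drives for a full 32 trillion characters: {drives_for_library}")
```

Taken to the last byte, 32 trillion characters would want a 26th
drive; 25 drives cover 31.25 terabytes, which is inside the "~" on
the estimate.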



The Last Bit, or Byte, of Un-Advertising


OK, the average computer goes for under $500.

For $2,000 you can add enough hard drive to hold a LARGE library.

Total cost of hardware:  less than $2,500.

Then add in the cost of your high speed connection.

Calculate how many gigabytes per day you can download.


Warning:  some places limit you to one gig per day.

Some wireless connections limit you to 1/3 gig per day. . . .

Be careful to ask about this before you sign. . . .


At 3 gigabytes per day you could download 1 terabyte per year.

And have 10% left over for entertainment value.


It would take you a decade to download a major great library.


At 6 gigabytes per day you could download 2 terabytes per year.

Now only half a decade.


At 25 gigabytes per day you are down to around a year. . . .


It's possible. . . .


And it will just get more possible every single year. . . .
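
A sketch of the download arithmetic, for anyone who wants to plug
in a different daily rate.  It assumes the library is fetched in
compressed form (32 terabytes of text at the 2.5x ratio used
earlier, so about 12.8 terabytes on the wire); that assumption is
what makes the "decade" and "half a decade" figures line up.  The
function name is made up for the illustration.

```python
# Years needed to download a ~32 TB text library at a given daily quota.
LIBRARY_TB = 32          # uncompressed text, from the estimate above
COMPRESSION = 2.5        # assumed: the download is compressed
DAYS_PER_YEAR = 365


def years_to_download(gb_per_day, library_tb=LIBRARY_TB,
                      compression=COMPRESSION):
    """Years to fetch the compressed library at gb_per_day (decimal GB)."""
    compressed_gb = library_tb * 1_000 / compression
    return compressed_gb / (gb_per_day * DAYS_PER_YEAR)


for rate in (3, 6, 25):
    tb_per_year = rate * DAYS_PER_YEAR / 1_000
    print(f"{rate:>2} GB/day is about {tb_per_year:.1f} TB/year, "
          f"or {years_to_download(rate):.1f} years for the library")
```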



Note:

At the same time I sent out the first draft of this I had
a note in my email advising me of 1T drives for $119. . . .

Call it $120.

That cuts all the terabyte prices listed above by 25% and cuts
the space, cabling, heat, etc., by HALF!!!
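
The 25% and the HALF can be checked the same way, using the $80
price for 500 gigabytes from earlier and the rounded $120 for one
terabyte.  Variable names are just for the illustration.

```python
# Price per terabyte and drive count, old drive versus new drive.
OLD_PRICE, OLD_TB = 80, 0.5     # $80 for a 500 gigabyte drive
NEW_PRICE, NEW_TB = 120, 1.0    # $119 rounded to $120 for 1 terabyte

old_per_tb = OLD_PRICE / OLD_TB
new_per_tb = NEW_PRICE / NEW_TB
price_cut = 1 - new_per_tb / old_per_tb      # fraction saved per terabyte
drive_count_ratio = OLD_TB / NEW_TB          # new drives needed per old drive

print(f"${old_per_tb:.0f}/TB -> ${new_per_tb:.0f}/TB, a {price_cut:.0%} cut")
print(f"drives needed: {drive_count_ratio:.0%} of the old count")
```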

Monday, April 14, 2008 11:52:29 AM (Eastern Daylight Time, UTC-04:00)