LISTSERV - MUSEUM-L Archives - HOME.EASE.LSOFT.COM

MUSEUM-L Archives

Museum discussion list

MUSEUM-L@HOME.EASE.LSOFT.COM

Subject:	Re: Formats in scanning images
From:	Mark Rosenstein <[log in to unmask]>
Reply To:	Museum discussion list <[log in to unmask]>
Date:	Sat, 12 Sep 1998 08:53:19 -0400
Content-Type:	text/plain
Parts/Attachments:	text/plain (93 lines)

   Date:         Sat, 12 Sep 1998 11:25:14 GMT
   From: "Emanuel Andrade C. Sancho" <[log in to unmask]>

   It would be interesting to know what formats you are using in your
   scanning or photographing work.
   The fact is that each format has its own characteristics, file size,
   etc. We are always interested in the best quality but, for
   example, with my hardware a single scanned image at top quality
   occupied about 90 MB on disk (GIF), which is amazing
   and completely out of the question. A medium-quality image is about
   20 MB. As everybody knows, JPG is excellent for screen viewing and has
   high compression levels. Apparently it is not so good for printing. BMP
   is very memory-consuming, etc. Indeed, for us in
   museums, where such work is very time-consuming and we are talking
   about thousands of images at least, it is very important to make the
   right decision.

I think the first question is what the purpose of the scan is. Is it
for preservation, for access, or for a notation in a catalog? If it is
for preservation (i.e. the original will be destroyed after scanning),
then I highly recommend, if at all possible, waiting until the issues
in preservation in digital media are better understood.

If it is for access, and the access will be via a browser on the
World Wide Web, then no matter what format you choose to store
the image in, you will eventually have to convert it to one of the
two formats web browsers currently understand: GIF or JPEG.

In saving an image, you typically want to compress it before saving.
Compression comes in two flavors: lossless and lossy. In lossless
compression, even though the file is smaller, all the information is
preserved from the original scan. In lossy compression, some of the
information in the original scan is thrown away, so that when you
uncompress the image it is not exactly what you had when you first
scanned it. Typically lossy compression generates smaller file sizes
than lossless compression, but not always.
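
As a rough illustration of the two flavors, here is a minimal Python
sketch. The "scan" is synthetic data I made up for the example, zlib
stands in for a lossless compressor, and the lossy step (dropping the
two lowest bits of every pixel) is a deliberately crude scheme of my
own, not what JPEG actually does.

```python
import random
import zlib

# Hypothetical stand-in for a scan: a 256x256 grayscale gradient whose two
# lowest bits per pixel are random "sensor noise".
rng = random.Random(0)
WIDTH = HEIGHT = 256
original = bytes((y & 0xFC) | rng.randrange(4)
                 for y in range(HEIGHT) for _ in range(WIDTH))

# Lossless: compress, decompress, and get back every byte of the scan.
lossless_packed = zlib.compress(original, 9)
lossless_restored = zlib.decompress(lossless_packed)

# Lossy (crude, for illustration only): throw away the two lowest bits of
# each pixel, then compress what is left.
reduced = bytes(value & 0xFC for value in original)
lossy_packed = zlib.compress(reduced, 9)
lossy_restored = zlib.decompress(lossy_packed)

print("raw size:          ", len(original))
print("lossless size:     ", len(lossless_packed))
print("lossy size:        ", len(lossy_packed))
print("lossless identical:", lossless_restored == original)
print("lossy identical:   ", lossy_restored == original)
```

The lossless round trip returns the original bytes exactly, while the
lossy one returns a smaller file that no longer matches the original.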

JPEG is a lossy storage mechanism. In compressing an image to JPEG
there is typically a parameter you can specify, which indicates how
lossy the compression will be. The more lossy, typically, the smaller
the image size will be. GIF is a lossless compressed file format, but
its palette is limited to 256 colors, so images with more than 256
colors will experience loss, as explained below.

A second characteristic of a file format is how the image is
represented.  For color images, your scanner will report how many
dots-per-inch are stored, as well as the number of bits-per-pixel,
which is the color depth or how many colors a given pixel can
be. Typically 24 bits-per-pixel are captured by a scanner. JPEG will
store all 24 bits (but due to the lossy nature of its compression, it
may not be exactly the same 24 bits as the original). GIF, at maximum,
can only store 8 bits per pixel, so you are limited to 256 colors in
your image. This tends to be fine for such things as icons and
grayscale items (where you have 256 shades of gray), but on full color
images it can lead to noticeable artifacts. So even though the
compression of GIF is lossless, if you have an image that contains
more than 256 colors, there will be loss in color fidelity.
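
To make the color-depth arithmetic concrete, here is a small Python
sketch. The fixed 3-3-2 palette (3 bits red, 3 bits green, 2 bits blue)
is only an assumption chosen for simplicity; real GIF converters
usually pick a palette adapted to each image, and the sample colors are
made up.

```python
def color_count(bits_per_pixel):
    """Number of distinct colors a pixel of the given depth can take."""
    return 2 ** bits_per_pixel


def to_332(rgb):
    """Map a 24-bit (r, g, b) color to an 8-bit palette index.

    Keeps the top 3 bits of red, the top 3 bits of green, and the top
    2 bits of blue. A simplified stand-in for palette reduction.
    """
    r, g, b = rgb
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


print(color_count(24))  # colors available at 24 bits per pixel
print(color_count(8))   # colors available at 8 bits per pixel

# Two different (hypothetical) shades of orange from a 24-bit scan...
shade_a = (200, 120, 40)
shade_b = (210, 125, 50)

# ...land on the same palette entry, so the difference between them is lost.
print(to_332(shade_a), to_332(shade_b))
```

Once two distinct colors share a palette entry, no amount of lossless
compression afterwards can bring the distinction back.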

Not only do file formats have color constraints, so do the machines
visitors' browsers run on. There are still many machines out there
that can only display 256 colors, and so no matter what image format
you use, the browser will reduce the image to fit into 256 colors.

Due to the problems with both GIF and JPEG, a number of sites store
their images in TIFF format, which is lossless and full 24-bit color,
and then, either on the fly or in the background, create GIF or JPEG
images from this original source.
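
For planning disk space for such masters, a back-of-the-envelope
estimate of the uncompressed size may help. This is only a sketch: the
8 x 10 inch original, the 300 and 600 dpi settings, and the count of
1000 images are assumptions for illustration, and a TIFF saved with
lossless compression (for example LZW) will usually come out smaller.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Uncompressed size in bytes of a scan of the given dimensions."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 8 x 10 inch original scanned in 24-bit color.
high = raw_scan_bytes(8, 10, 600)
medium = raw_scan_bytes(8, 10, 300)

print(high / 1e6, "MB per image at 600 dpi")
print(medium / 1e6, "MB per image at 300 dpi")
print(1000 * high / 1e9, "GB for 1000 images at 600 dpi")
```

Halving the resolution cuts the size to a quarter, because the pixel
count scales with the square of the dots-per-inch setting.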

There are other issues, such as gamma, which helps explain why
images displayed on Macs, PCs, and Unix boxes look different, but
those are beyond the scope of this already too long note.

Since in scanning, it is the scanning that is expensive, a reasonable
suggestion is to store the original scan in TIFF format, which is a
larger file, but preserves the information, and then convert the TIFF
file to either GIF or JPEG. GIF gives good compression and no loss for
items with 256 or fewer colors, while JPEG is the better choice for
color photographs. It is worthwhile to be an experimentalist: despite
the above theory, these recommendations are just rules of thumb, so
convert a sample of your images to both GIF and JPEG, compare the
quality and file size, and then make your own choice.

A useful resource is "Real World Scanning and Halftones" by Blatner and
Roth, and also the work at Cornell University Library on scanning.
I know the Getty has done work in this area, and there is a fairly
large literature from the digital library world.

There is active research going on in this area, so any recommendations
will likely change as these issues are better understood.

Hope this helps.

Mark.
