org.paneris.bibliomania
Data Loading Procedures

(document $Revision: 1.5 $)

Background

This document describes the steps needed to load data onto Bibliomania.com.

Loading data requires coordinated action by three teams; this document describes those actions.

Teams and Roles

The data is captured either manually or by OCR by the Text Preparation Team (TPT). The books are chosen, marked up, and checked by the Editorial Team (ET).

Procedure

Document analysis and data preparation plan

ET decide on the book and on which elements of the particular publication are to be excluded (e.g. notes from commentators).

Book marked up by ET.

Each page of the book is marked up, with page numbers and other unused items crossed out.

Book despatched from UK via courier.

ET notify Bibliomania Data Board.

Book typed in chapters

Note that the headword search relies upon bold text being marked up with the strong tag, not the B tag.
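
A typed chapter could be checked for stray B tags before upload. The following is a minimal sketch only: the function name find_b_tags is hypothetical and is not part of any Bibliomania tool, and the check is a simple pattern match rather than a real HTML parse.

```python
import re

# Matches an opening <b> or <B> tag, with or without attributes.
# Tags such as <br> or <body> are not matched, because the character
# after "<b" must be whitespace or ">".
B_TAG = re.compile(r"<b(\s[^>]*)?>", re.IGNORECASE)


def find_b_tags(html):
    """Return the 1-based line numbers on which an opening B tag appears."""
    return [
        number
        for number, line in enumerate(html.splitlines(), start=1)
        if B_TAG.search(line)
    ]
```

Any line number reported is a place where the B tag should probably be replaced by the strong tag.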

Chapters validated against the HTML 4.0 Transitional DTD or an earlier HTML DTD

We need to put this in place.

Key.txt and index.wm prepared

Key.txt should contain two or three columns separated by tabs.

filename short chapter title long chapter title (optional)
0001.html Chapter 1 In the beginning
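
The column layout above can be checked mechanically before upload. This is a minimal sketch under assumptions: the function names are hypothetical, every non-blank line is treated as a data row, and nothing here reflects how the site itself reads the file.

```python
def parse_key_line(line):
    """Split one Key.txt row into (filename, short title, long title or None)."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) not in (2, 3):
        raise ValueError(
            "expected 2 or 3 tab-separated columns, got %d: %r" % (len(fields), line)
        )
    filename, short_title = fields[0], fields[1]
    # The long chapter title is optional.
    long_title = fields[2] if len(fields) == 3 else None
    return filename, short_title, long_title


def parse_key_txt(text):
    """Parse every non-blank line of a Key.txt file."""
    return [parse_key_line(line) for line in text.splitlines() if line.strip()]
```

A row with only one column, or with more than three, raises ValueError, so a missing or extra tab is reported before the book is loaded.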

Book uploaded to server

Book/Author records added to db

Book Paginated/encached

Book viewed page by page on site, with links and styling visually checked by Laserwords.

Book indexed

Notify this board including size in megabytes

Book reviewed by Editorial Team

Payment authorised

HOWTO

Howto link to a book

 $bib.url($bib.book(1541))

Howto link to a chapter

 $bib.url($bib.book(1541).chapter(130))

Howto link to an Author

 $bib.url($db.AuthorTable.Object(238))

Howto link to a named anchor

 $bib.anchorURL($bib.book(1541).chapter(130), "p282_4")

Note that when linking to a named anchor in a later chapter the encache step needs to be run twice, as initially the named anchor cannot be found. (See 27046)

Howto use TeX to keep a paragraph together

You can insert ANY raw TeX in files between <SPAN CLASS=tex> and </SPAN> tags. So to keep a paragraph intact use:

<SPAN CLASS=tex>\nobreak</SPAN>

Howto insert a footnote


<span class="footnote" number="1">Footnote text</span>

Footnotes should be numbered sequentially within a chapter, and there should be no more than 50 in a chapter.

The first thing in a chapter should not be a footnote. If you really need to place a footnote as the first thing then add an empty paragraph before the footnote.
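
The numbering rules could be checked with a short script. This is a sketch only, with several assumptions: the function name check_footnotes is hypothetical, footnotes are assumed to be written exactly in the form shown above (class before number, double quotes), and numbering is assumed to start at 1 in each chapter.

```python
import re

# Matches the opening tag of a footnote span as written in the example above.
FOOTNOTE = re.compile(r'<span\s+class="footnote"\s+number="(\d+)"', re.IGNORECASE)
MAX_FOOTNOTES = 50


def check_footnotes(chapter_html):
    """Return a list of problems found in one chapter's footnote numbering."""
    numbers = [int(match.group(1)) for match in FOOTNOTE.finditer(chapter_html)]
    problems = []
    # Assumption: footnotes run 1, 2, 3, ... in document order.
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append("footnote numbers are not sequential: %s" % numbers)
    if len(numbers) > MAX_FOOTNOTES:
        problems.append("%d footnotes exceeds the limit of %d" % (len(numbers), MAX_FOOTNOTES))
    return problems
```

An empty list means no problem was found; the check does not cover the rule about a footnote being the first thing in a chapter.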

Howto paginate in batch mode

        bibimport <operation> <recurse> <table> <id | _> [flushafter|flushduring] >errors 2>wmerrors

where

        operation = number (see below)
        recurse = recurse | norecurse | skipChaps
        table = section, author, book, or chapter
        id = id number of sec/auth/book/chap in question
             (_ = all)
        flush = flushafter | flushduring, optional flag to determine when to flush memory

The operation number is calculated by adding together the numbers below for each operation required.

encache 1
index 2
paginate 4
keydottxt 8

For instance,

        bibimport 1 recurse section 1
        bibimport 15 recurse book 522 flushduring
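
The arithmetic behind the operation number can be sketched as follows. The values come from the table above; the function names are hypothetical helpers for illustration and are not part of bibimport.

```python
# Values taken from the operation table in this document.
OPERATIONS = {"encache": 1, "index": 2, "paginate": 4, "keydottxt": 8}


def operation_number(*names):
    """Add up the values of the named operations, counting each name once."""
    return sum(OPERATIONS[name] for name in set(names))


def operations_in(number):
    """List the operation names whose values are included in a given number."""
    return [name for name, value in OPERATIONS.items() if number & value]
```

So the 1 in the first example selects encache only, and the 15 in the second is 1 + 2 + 4 + 8, which selects all four operations.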

Known Problems


About this document

Authors

Tim Pizey <timp@paneris.org>

Most recent CVS $Author: timp $ @paneris.org

Readership and purpose

History

The important points in the life of this document are listed below (for detailed change history consult its CVS log).

The CVS log for this document is:


$Log: DataLoadingProcedures.html,v $
Revision 1.5 2003/11/18 20:09:18 timp
Add note about all records in bibimport

Revision 1.4 2001/07/28 14:41:06 timp
Tidy up

Revision 1.3 2001/07/27 02:50:00 timp
Sort out messagebURLs

Revision 1.2 2001/07/27 02:11:24 timp
Add notes from messageboards

Revision 1.1 2001/06/11 23:41:34 timp
Data Loading procedures

Revision 1.0 2001/03/08 16:47:46 TimP
First version