JILT 1996 (3) - Robin Williamson
The Law Reports CD-ROM - A Case Study
Robin Williamson
Context Limited
Contents
- Abstract
- 1. Introduction
- 1.1 Background
- 1.2 The Council of Law Reporting
- 1.3 Context Limited
- 2. Case Law Databases
- 2.1 Case Reporting
- 2.2 Case Structure
- 2.3 Case Research
- 2.4 Case Law Databases
- 3. The Law Reports
- 3.1 The Printed Series
- 3.2 CD-ROM Tender
- 3.3 The Project
- 4. Data
- 4.1 Options
- 4.2 Data Format
- 4.3 Data Conversion
- 5. Software
- 5.1 Design Requirements
- 5.2 Usage Study
- 5.3 User Interface
- 6. Commercial Issues
- 6.1 The electronic Law Reports
- 6.2 Delivery
- 6.3 Pricing
- 6.4 Availability
The Law Reports, published since 1865 by the Council of Law Reporting, provide the common law and interpretation of statute law for England and Wales. This paper describes the development by the Council and Context of the electronic Law Reports (eLR), a CD-ROM database of the series. Some 1.5 gigabytes of data are being captured in the Far East using a tagged ASCII structure developed by Context. A new version of Context's JUSTIS software, incorporating hypertext linking of case references and a rich text display, has been designed with the participation of all branches of the legal profession.
Keywords: Case law databases, CD-ROM, Full text retrieval, Data capture, Tagging, Hypertext.
Date of publication: 30 September 1996
Citation: Williamson R (1996) 'The Law Reports CD-ROM - A Case Study', BILETA '96 Conference Proceedings, 1996 (3) The Journal of Information, Law and Technology (JILT). <http://elj.warwick.ac.uk/elj/jilt/bileta/1996/3william/>. New citation as at 1/1/04: <http://www2.warwick.ac.uk/fac/soc/law/elj/jilt/1996_3/special/williamson/>
1.1 Background
The Law Reports contain the reports of decisions of the superior courts of England and Wales which constitute binding precedents in English law, published continuously since 1865 by the Council of Law Reporting. This paper describes the development of the electronic Law Reports, a CD-ROM version of the series being developed jointly by the Council and Context Limited.
1.2 The Council of Law Reporting
The Incorporated Council of Law Reporting for England and Wales, a registered charity, was established by the legal profession in 1865 to provide authoritative reporting of case law. Council members are drawn from all branches of the legal profession. The Council publish case law in three series: The Law Reports, the Weekly Law Reports and the Industrial Cases Reports. The Council of Law Reporting also publish The Law Reports Index (the Red Book), a comprehensive Index to cases published by the Council and other sources.
1.3 Context Limited
Context, a British company, is the leading electronic publisher of United Kingdom and European Union law and regulatory databases on CD-ROM. Context launched the first CD-ROM of European Community law in 1989 with JUSTIS CELEX. In 1990 Context produced the first CD-ROM of UK legislation, the Statutory Instruments database co-published with HMSO. In a cooperative venture with the Council of Law Reporting, the first CD-ROM of English case law, the Weekly Law Reports, was launched in 1991, with the Council's Industrial Cases Reports following in 1994. Other case law series published by Context include Jordans Family Law.
2.1 Case Reporting
The English legal system, based on the principles of common law, relies as much on precedent as on statute to interpret and apply the law. The accurate and authoritative reporting of cases is thus fundamental to the proper functioning of the law.
The founding of the Council of Law Reporting in 1865 established new standards of quality and authority in the reporting of case law. The selection of cases to be reported, and the standards to be applied, are entrusted to the Editor, always a distinguished and experienced lawyer. Reports of cases have to be prepared by a qualified member of the bar. The format, typographical layout and quality of the printed law reports conform to rigorous standards. The Law Reports use a formal method of referring to and quoting from other cases and statutory materials.
2.2 Case Structure
A typical case report includes the following elements:
Title: the names of the parties, court and judges and dates of the hearing and judgment
Catchwords: subject matter headings on which is based The Law Reports Index, with sub- and sub-sub- headings, followed by a few phrases to describe the nature of the case, and any relevant statutory provisions
Headnote: a summary of the case, setting out essential facts on which the decision is based and the reason for the decision, with references to relevant authorities
Citation: tables of cases referred to in the judgments and cited in argument
Facts: an introduction to the case
Counsel: names of the advocates appearing in the case
Judgment: the text of the judgment as delivered
Order: the decision of the court
Solicitors: names of the solicitors instructed in the case
Footnotes: references raised in the report
Reporter: name or initials of barrister reporting the case.
A database designer will immediately recognise that a text source that has a consistent formal structure, including keywords (catchwords), an abstract (headnote) and cross references (citation), is ready-made for delivery as a structured full text database. In setting the conventions for case reporting in 1865 the founding fathers of the Council of Law Reporting had little idea that they were designing a full-text database, but they produced a format that has stood the test of time and has all the best elements of database structure.
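The correspondence between the report elements and a database record can be sketched as follows. This is a minimal illustration in Python, not the schema used by JUSTIS or any Context product: the class name, field names and sample values are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CaseReport:
    """One law report; field names are illustrative, not an actual schema."""

    title: str                 # parties, court, judges, hearing and judgment dates
    catchwords: List[str]      # subject headings: the "keywords"
    headnote: str              # summary of facts and reasons: the "abstract"
    citations: List[str] = field(default_factory=list)  # cases referred to or cited
    facts: str = ""
    counsel: List[str] = field(default_factory=list)
    judgment: str = ""
    order: str = ""
    solicitors: List[str] = field(default_factory=list)
    footnotes: List[str] = field(default_factory=list)
    reporter: str = ""

    def as_fields(self) -> Dict[str, str]:
        """Flatten the record to field name -> text, ready for fielded indexing."""
        flat = {}
        for name, value in vars(self).items():
            flat[name] = "; ".join(value) if isinstance(value, list) else value
        return flat


# Invented example data, for illustration only.
report = CaseReport(
    title="Smith v. Jones",
    catchwords=["Contract", "Formation", "Offer and acceptance"],
    headnote="Held, that the offer had lapsed before acceptance.",
    citations=["Example v. Sample"],
)
fields = report.as_fields()
print(fields["catchwords"])
```

Keeping each element as a separate named field is what later allows a search to be restricted to, say, the headnote or the names of counsel.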
2.3 Case Research
The nature of legal research lends itself to database usage. The researcher is typically looking for a case that deals with a particular point of law, or involves a particular judge, lawyer, named party, or relates to another case. Database search tools such as full text searching, applying Boolean logic to fielded data, and the use of hypertext links are all eminently suitable for satisfying the type of query the user will wish to make. And CD-ROM is the ideal medium for database delivery. Searching itself often needs experiment with search strategies, and the end result of a search is frequently a lengthy full text document. Browsing is a way of life for the legal researcher. All these factors add up to an advantage for off-line usage, where the inhibiting virtual ticking of the clock accumulating time-based charges is eliminated. And lawyers frequently do their research away from their office or chambers - on the train, at home, in court. A case law CD-ROM and a laptop is a powerful alternative to looking through dozens of books in the library.
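Boolean logic applied to fielded data can be illustrated with a small inverted index. The sketch below is an assumption-laden toy in Python, not a description of the Ful/Text engine: the class `FieldedIndex`, its methods and the three sample documents are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FieldedIndex:
    """Toy inverted index keyed on (field name, term)."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of document ids

    def add(self, doc_id, fields):
        """Index every term of every field of one document."""
        for name, text in fields.items():
            for term in tokenize(text):
                self.postings[(name, term)].add(doc_id)

    def term(self, field_name, word):
        """Return the ids of documents whose given field contains the word."""
        return set(self.postings.get((field_name, word.lower()), set()))


# Invented sample documents.
idx = FieldedIndex()
idx.add(1, {"headnote": "Negligence and duty of care", "counsel": "A. Barrister"})
idx.add(2, {"headnote": "Contract formation", "counsel": "A. Barrister"})
idx.add(3, {"headnote": "Negligence in contract", "counsel": "B. Advocate"})

# AND, OR and NOT map directly onto set intersection, union and difference.
both = idx.term("headnote", "negligence") & idx.term("counsel", "barrister")
either = idx.term("headnote", "negligence") | idx.term("headnote", "contract")
without = idx.term("headnote", "negligence") - idx.term("headnote", "contract")
print(both, either, without)
```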
2.4 Case Law Databases
Context at present publishes CD-ROM versions of two case law series from the Council of Law Reporting, the Weekly Law Reports and Industrial Cases Reports.
Weekly Law Reports (WLR) is a weekly series published in three volumes which report all Superior Court decisions of importance to the development of the law. Volume 1 covers the starred cases of lesser importance; Volumes 2 and 3 consist of those cases that are eventually published with the addition of legal argument in The Law Reports. WLR covers judgments of English courts including the House of Lords, Privy Council, Court of Appeal, High Court, Ecclesiastical Courts, Restrictive Practices Court. The JUSTIS Weekly Law Reports database contains all three volumes from 1971 to the present, as well as a database version of The Law Reports Index (the Red Book), comprising some 300 megabytes growing by around 12 megabytes per year.
Industrial Cases Reports (ICR) date back to the inauguration of the National Industrial Relations Court in 1972, and incorporate the Council of Law Reporting's Restrictive Practices Reports. The name was changed from Industrial Courts Reports to Industrial Cases Reports with the establishment of the Employment Appeal Tribunal in 1975. The reports cover cases from the Employment Appeal Tribunal, the High Court, the Court of Appeal and the House of Lords as well as the Restrictive Practices Court, published monthly. The JUSTIS Industrial Cases CD-ROM contains the complete text of the Industrial Cases Reports from the commencement of the series in 1972, comprising 90 megabytes, growing by 4 megabytes per year.
Electronic text originated by the printers of the reports is used to maintain the two databases, but the bulk of the archival material exists in print form only. In total some 36 years of reports from the two series had to be captured from the printed versions. This work was undertaken by Context, using a data capture source in the Far East for the basic text conversion task.
The case law databases are accessed with the JUSTIS full text retrieval system, consisting of a user interface developed by Context running under DOS or Windows, using the Ful/Text retrieval engine from Fulcrum Technologies Inc. JUSTIS is used for all Context's CD-ROM and online databases, and features a seamless link between CD-ROM and online. This allows the CD-ROM user to move to Context's online service by a single function key or click of the mouse. The online service contains updates to the CD-ROM data, so the CD-ROM user can access the latest information alongside archival data on the CD-ROM without changing interface or leaving the current search session. This ability to switch easily between different databases also applies to switching between other JUSTIS CD-ROMs.
3.1 The Printed Series
The Law Reports contain the reports of decisions of the superior courts of England and Wales which constitute binding precedent in English law and have been published continuously since 1865. Judicial precedents are "judge-made laws" and when acted upon become of equal force with statute law in establishing the law of the land in a court of law. Where a case is reported in The Law Reports that report must be cited in preference to any other series of reports. The Law Reports provide the common law and the interpretation of statute law. Cases reported in The Law Reports contain the elements published in the Weekly Law Reports, with the addition of argument. The Law Reports are published in monthly parts. The series now comprises some 750 volumes in total, amounting to some 480,000 pages, 200,000,000 words or 1.5 gigabytes of data.
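The size figures quoted above can be cross-checked with a little arithmetic. The sketch below uses only the numbers given in the text, plus two labelled assumptions: that a gigabyte is taken as 10**9 bytes, and that a CD-ROM holds roughly 650 megabytes. The disc estimate ignores index overhead and any compression, so it is a rough lower bound rather than the actual disc count of the product.

```python
import math

VOLUMES = 750
PAGES = 480_000
WORDS = 200_000_000
DATA_BYTES = 1.5e9            # assumption: 1 gigabyte = 10**9 bytes
CD_CAPACITY_BYTES = 650e6     # assumption: nominal CD-ROM capacity

pages_per_volume = PAGES / VOLUMES          # average length of a volume
words_per_page = WORDS / PAGES              # average words on a page
bytes_per_word = DATA_BYTES / WORDS         # includes spaces and punctuation
min_discs = math.ceil(DATA_BYTES / CD_CAPACITY_BYTES)

print(pages_per_volume, round(words_per_page), bytes_per_word, min_discs)
```

On these assumptions an average volume runs to 640 pages of a little over 400 words each, and the raw text alone would not fit on a single disc.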
The Law Reports are written by law reporters, who must be barristers, and who select the cases according to principles established in 1865. The reporters sit in court and take notes of legal argument. They check the references and citations in the transcript of the judgment, write the headnote (which must contain the salient facts of the case and the court's reasoning for its judgment), add the catchwords and prepare case lists and names of counsel and solicitors. Proofs are sent to judges and counsel for approval and are then seen by the Editor to ensure they comply with the standards set by The Law Reports.
A set of The Law Reports will be found in almost every barrister's and judge's chambers and in the library of the larger law firms. Every law school has a set, and reference to The Law Reports is an essential part of legal training. A new reprint of a complete set of The Law Reports from the present back to 1865 costs over £18,000, with a good second-hand set selling for around £10,000 to £12,000.
A key feature of the printed version of The Law Reports is the typographical layout of the printed page. The content of the report to a great extent determines the typography. Figure 1 is a page from an early law report. It illustrates the following typographical features:
Case Name
Catchwords
Headnote
Figure 2 is a page from a more recent law report. The similarity of layout and typographical convention will be immediately evident. Of course, over time there have been many changes of detail. However, a law report from 1865 has the look and feel of a law report from 1995, and both are instantly recognisable as law reports by the professional reader.
3.2 CD-ROM Tender
The acceptance of the CD-ROM versions of its other series by the profession encouraged the Council of Law Reporting to consider the publication of the complete set of The Law Reports on CD-ROM. Because of the size of the project and its importance to the profession, the Editor drew up a formal set of requirements for the product, and in July 1994 the Council of Law Reporting invited a short list of electronic publishers to make proposals for developing the product in accordance with these requirements.
The Council of Law Reporting was concerned to maintain the look and feel of the printed page in any on-screen display, since this is an integral part of the way information concerning the content of the report is conveyed to the reader. Other key requirements included the following:
The database to comprise the complete series of The Law Reports from 1865 to the present
Search software to be based on the formal structure of a law report, and to incorporate links between case authorities referred to in the headnotes, both preceding cases as well as succeeding cases
Links for statutory provisions referred to in the catchwords or headnote to the full text as set out in the Judgment
Ability to select date ranges, case name, division, case reference, page number etc; also provision for full text index and Boolean searching
Appropriate document navigation, annotation, saving and printing functions.
Data capture of the complete printed archive of The Law Reports.
3.3 The Project
Context's proposals were accepted, and the decision to proceed with the development was made by the Council in December 1994.
Under the agreement with the Council of Law Reporting Context is responsible for the design, implementation and project management of all technical aspects of the project including the management of the data capture contract. The Council of Law Reporting sets the quality standards required for the on-screen presentation of the text, and for the conformation of the user interface to the requirements of the legal profession.
The investment required to build the product and bring it to market is shared equally between Context and the Council of Law Reporting. Context is responsible for marketing the product and providing technical support and training. The Council and Context share the revenue.
4.1 Options
An important element in the proposed electronic Law Reports product is the ability of the front end software to display the documents in a rich format. Although screen displays are not expected to be facsimile images of the printed copy, they must be presented using typefaces and styles as close to the original as possible.
In preparing its submission to the Council of Law Reporting Context considered the options available for encoding The Law Reports data, summarised as follows:
ASCII: The ASCII format represents the text only, and does not interpret the structure of the work.
Tagged: Tags, or codes added to the ASCII format to indicate the presence of specific types of text, for example the start of the headnotes or catchwords fields in a law report. The set of tags added to the ASCII text are the minimum required by the design of the electronic product for which the data is to be used.
Generic: The full structure of a printed work captured electronically by using a standard generic form of tagging, such as SGML (Standard Generalised Markup Language) or RTF (Rich Text Format). Generic coding gives a richer set of structure tags than tagged ASCII, since it is the full structure of the printed work that is encoded rather than the subset needed solely for a particular electronic product. A generic format can be converted through appropriate software into any other format that has the same or lesser structural representation.
Hypertext: Cross references embedded in the text are identified and marked electronically to be exploited by the search software to give the user the ability to go directly to the referenced text. The hypertext links are identified by a combination of automatic processing and editorial input. Hypertext links can be added to a file that is in tagged ASCII or generic format. The automatic identification of hypertext links is sometimes made easier if a generic format is used, but automatic hypertext link identification can be equally effective where the tags in a tagged ASCII format are carefully chosen.
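The automatic part of hypertext link identification can be sketched with a regular expression that recognises case references in running text. The example below is a simplified assumption, not Context's method: it handles only square-bracketed years and a short, hand-picked list of series abbreviations, and the `<ref>` tag it emits is invented for illustration. The first reference in the sample text is the well-known citation for Donoghue v. Stevenson; the second is made up.

```python
import re

# Assumed, deliberately short list of series abbreviations.
SERIES = r"(?:A\.C\.|Q\.B\.|K\.B\.|Ch\.|W\.L\.R\.|I\.C\.R\.)"

# Matches forms such as "[1932] A.C. 562" and "[1983] 1 W.L.R. 1000".
CITATION = re.compile(
    r"\[(?P<year>\d{4})\]\s+(?:(?P<volume>\d+)\s+)?"
    r"(?P<series>" + SERIES + r")\s+(?P<page>\d+)"
)


def find_citations(text):
    """Return one dict per case reference recognised in the text."""
    return [
        {
            "year": int(m["year"]),
            "volume": int(m["volume"]) if m["volume"] else None,
            "series": m["series"],
            "page": int(m["page"]),
        }
        for m in CITATION.finditer(text)
    ]


def mark_links(text):
    """Wrap each recognised reference in a hypothetical <ref> tag."""
    return CITATION.sub(lambda m: "<ref>" + m.group(0) + "</ref>", text)


sample = "See Donoghue v. Stevenson [1932] A.C. 562, applied in [1983] 1 W.L.R. 1000."
found = find_citations(sample)
marked = mark_links(sample)
print(found)
print(marked)
```

References that the pattern misses, or matches wrongly, are the cases that would be left to editorial input.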
4.2 Data Format
In their submission to the Council of Law Reporting, Context recognised the case for holding the complete text of The Law Reports in a generic format such as SGML. However, the large volumes of text involved proved a key factor in influencing Context's ultimate decision. Conversion from print to SGML needs the services of a dedicated SGML supplier. Alternatively a less specialised facility can be used. This choice involves providing the supplier with detailed instructions, training operators and establishing validation procedures to ensure a high level of quality. This is time consuming and very difficult to achieve. Either SGML option proved to be uneconomic for the volumes involved.
After analysis of the requirements of the project it became clear that the level of detail and structure offered by SGML or similar generic languages was excessive for the requirements defined for The Law Reports. A further important consideration was the fact that the chosen search engine did not support structured tagging languages.
Context concluded that a mark-up language using tagged ASCII could meet the data capture, presentation and performance requirements of The Law Reports, taking into consideration future developments and the advantages to be gained through flexibility.
Sample volumes from The Law Reports covering the whole series were studied in order to create a framework within which cases, pages and elements could be identified and labelled in a uniform manner. This was a complex process due to differences in styles and formats used throughout the last 130 years. The resulting framework was then used to design the mark-up language, named CCDF (Context's compound document format).
The CCDF mark-up language had to be a true reflection of a design framework based on the sample used in the analysis, and had therefore to be flexible enough for fine tuning during the lifetime of the project. There were two further objectives:
the CCDF language (and the instructions for the keyboard operators) must be simple enough to ensure an acceptable level of accuracy
any information tagged in CCDF format must be convertible to other tagging systems with a minimum of manual intervention.
In total about 40 different tags are used in the CCDF system to mark up The Law Reports documents. In addition to the tagging system, CCDF includes formatting software that gives a typographical format for screen and print display and compresses data and tags before transfer to the CD-ROM.
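The general idea of a tagged ASCII file can be illustrated with a small parser. The actual CCDF tag names and syntax are not given in this paper, so everything about the format below is a hypothetical stand-in: a line of the form `@TAG: text` opens a field, and untagged lines continue the current field.

```python
def parse_tagged(text):
    """Split a tagged ASCII document into a dict of field name -> text.

    The "@TAG: text" syntax is invented for this sketch; it is not CCDF.
    """
    parts = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@") and ":" in stripped:
            tag, _, rest = stripped[1:].partition(":")
            current = tag.strip().lower()
            parts[current] = [rest.strip()] if rest.strip() else []
        elif current is not None and stripped:
            parts[current].append(stripped)  # continuation of the open field
    return {name: " ".join(chunks) for name, chunks in parts.items()}


# Invented sample in the hypothetical format.
SAMPLE = """\
@TITLE: Smith v. Jones
@CATCHWORDS: Contract - Formation -
Offer and acceptance
@HEADNOTE:
Held, that the offer had lapsed before acceptance.
"""

parsed = parse_tagged(SAMPLE)
print(parsed["catchwords"])
```

A scheme of this kind keeps the instructions for keyboard operators short, since only the start of each element has to be marked.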
CCDF provides sufficient structure coding required to present pages from The Law Reports on screen in a way that is familiar to the users of the printed volumes. The availability of translation tables allows Context to use the system to mark up other data sets and develop future products.
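A translation table of the kind mentioned above might, in its simplest form, map each product-specific tag onto an element name in the target tagging system. The sketch below assumes invented tag names (`CW`, `HN`, `JU`) and invented target element names; it shows the mechanism only and is not taken from CCDF.

```python
from xml.sax.saxutils import escape

# Hypothetical mapping from short capture tags to generic element names.
TRANSLATION = {"CW": "catchwords", "HN": "headnote", "JU": "judgment"}


def translate(records, table=TRANSLATION):
    """Convert (tag, text) pairs into SGML-style elements, one per line."""
    lines = []
    for tag, text in records:
        if tag not in table:
            # An unmapped tag is where manual intervention would be needed.
            raise KeyError("no translation for tag %r" % tag)
        element = table[tag]
        lines.append("<%s>%s</%s>" % (element, escape(text), element))
    return "\n".join(lines)


converted = translate([
    ("CW", "Contract - Formation"),
    ("HN", "Held, appeal allowed & costs awarded."),
])
print(converted)
```

Because the conversion is driven entirely by the table, retargeting the same data to another tagging system would mean supplying a different table rather than re-keying the text.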
Viewer/Editor software has been designed and developed for checking the validity and accuracy of the marked-up documents. A split-window arrangement presents the operator with a tagged file in one window and the intended rich display of the same portion of text in another. The operator can review the final presentation and make any necessary changes without having to rely purely on syntax and memory.
4.3 Data Conversion
The data conversion process consists of three elements: conversion into machine readable text, addition of structure and layout tags, and insertion of cross-reference links.
Context already has experience in the conversion of large volumes of text into machine readable format, and the addition of structure tags to converted data. Over 14 years of the Weekly Law Reports and 22 years of Industrial Cases Reports have been converted for the current range of CD-ROMs. This task has been undertaken using a data conversion company in the Far East under direct contract with Context. Context has developed a close working relationship with this company, and has set quality standards and monitored these by sending a senior member of its technical department to spend time with the company. A contract to capture The Law Reports text to the specification drawn up by Context and agreed with the Council of Law Reporting was negotiated in March 1995. The work commenced in April 1995, and is scheduled to take 12 months to complete.
The data conversion activity goes through the following stages:
Tagging Instructions: Detailed instructions were produced at the start of the project, defining the way the CCDF tagging scheme is to be applied to the different types of case report layout and presentation, with examples.
Data Capture: The data capture contract requires the contractor to convert the text to a set accuracy level, and to insert the CCDF tags. Once the processing is completed diskettes with the processed data are returned to Context. The printed originals are retained until the electronic data is checked by Context. After any necessary corrections have been made the originals are returned to Context.
Batch checking: The results of the contractor's processing are checked on receipt in the UK by Context's own operators. Tagging is 100% checked, and text conversion accuracy is tested using formal statistical sampling techniques developed for the project. Additional hypertext tagging is then added.
Consolidation: Once all the data conversion work is completed the batches are consolidated into a single collection, and final integrity checks carried out. The data is then indexed and formatted before being transferred onto CD-ROM.
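The sampling techniques developed for the project are not described in this paper. As one possible illustration of the batch checking stage, the sketch below proof-reads a random sample of characters, counts the errors found, and accepts the batch only if the upper end of an approximate 95% Wilson score interval for the error rate is within a target. The choice of the Wilson interval, the function names and the 0.1% target are all assumptions made for this example.

```python
import math


def wilson_upper_bound(errors, sample_size, z=1.96):
    """Upper end of the Wilson score interval for an error proportion."""
    if sample_size <= 0 or not 0 <= errors <= sample_size:
        raise ValueError("need 0 <= errors <= sample_size and sample_size > 0")
    p = errors / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return centre + half_width


def accept_batch(errors, sample_size, max_error_rate, z=1.96):
    """Accept only if the error rate is plausibly within the target."""
    return wilson_upper_bound(errors, sample_size, z) <= max_error_rate


# Hypothetical target: at most 1 character error in 1,000.
TARGET = 0.001
clean = accept_batch(errors=0, sample_size=10_000, max_error_rate=TARGET)
poor = accept_batch(errors=20, sample_size=10_000, max_error_rate=TARGET)
print(clean, poor)
```

Testing against the upper bound rather than the observed rate makes the check conservative: a small sample with few errors is not, on its own, enough to pass a batch.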
5.1 Design Requirements
Context's current product range uses the company's own JUSTIS user interface, linked to the Ful/Text search engine licensed from Fulcrum Technologies Inc. JUSTIS gives a consistent access system to a wide variety of databases, including full text products such as CELEX (the official legal database of the European Communities), United Kingdom Statutory Instruments, and the Weekly Law Reports and Industrial Cases Reports from the Council of Law Reporting. JUSTIS is also used for bibliographic and reference databases, most notably for the JUSTIS Parliament series. JUSTIS is available under DOS and Windows.
After studying the initial specification set out by the Council of Law Reporting, Context decided that the special and unique requirements of The Law Reports needed a new approach, rather than the application of the current JUSTIS software. In particular, the electronic Law Reports product requires a screen presentation and hard copy output that retains the distinctive typographic style of the original, while offering the user a natural but powerful method of exploiting the added value of the electronic format. The design developed by Context builds on the successful JUSTIS system. It is based on the same Ful/Text search engine, and has been developed as a Windows product from the start. The standard JUSTIS Windows software has strong similarities to the version developed for the electronic Law Reports, and the user of the Weekly Law Reports or Industrial Cases Reports (or indeed any of the JUSTIS product range) will be able to use the electronic Law Reports without having to learn a radically different search system, and vice versa.
5.2 Usage Study
At the start of the project it was realised that a database of the complete series of The Law Reports would present the legal profession with a radically new way of conducting its case law research. It was essential to consult the profession before making any final decisions about key user interface features.
Accordingly, the Council of Law Reporting invited all branches of the legal profession and the teachers of law to participate in a detailed study of the potential usage of the electronic Law Reports database. This invitation was enthusiastically accepted by representatives of the judiciary (including a Law Lord and two judges of the superior courts), the Bar Council (including four QCs), the Law Society and the Committee of Heads of University Law Schools. The study was carried out for the Council of Law Reporting by the Law Technology Centre at Warwick University during April and May 1995. At the conclusion of the study a seminar was held at the Law Society, attended by many of those who took part in the survey. The Editor of The Law Reports and representatives of Context took part in the seminar and discussed in detail the survey results with the researchers from the Law Technology Centre. The guidance received from the study participants has strongly influenced the final design of the user interface for the CD-ROM.
A beta version of the software planned for completion early in 1996 will be tested by a number of people who took part in the study, as well as other users representative of the target customer base for the product, providing a further level of user input to the design of the interface.
5.3 User Interface
The key features of the design of the product respond to the basic requirements set by the Council of Law Reporting, as follows:
The database to comprise The Law Reports from 1865 to the present: The product stores the database across several CD-ROMs. The full text index links the entire set of reports, so that when the CD-ROM set is presented to the search software on a juke box the database is treated as a single extended database. Alternatively, the entire database can be transferred to hard disk from the CD-ROMs. One of the discs is the update disc, released regularly with additional material. The update disc also contains updates to cross references that affect cases held on the archive discs.
Search software to be based on the formal structure of a law report, and to incorporate links between case authorities referred to in the headnotes, both preceding cases as well as succeeding cases: The field structure of the database reflects the main textual elements in a law report, essentially the field structure described in Section 2.2 above, with the addition of argument. Searches may be restricted to a single field or any combination of fields. The field structure is also exploited for navigation within a document, allowing the user to skip to any desired section of the document. Cases referred to in the headnote form automatic hypertext links. A Case Link feature also gives the user a list of all cases referred to in the selected case (preceding cases) as well as all cases that refer to the selected case (succeeding cases).
Links for statutory provisions referred to in the catchwords or headnote to the full text as set out in the Judgment: A Text Link feature allows the user to mark any text string and search for other instances of the text string anywhere in the database (limited if required to selected fields such as headnote or catchword). This facility can be used to select statutory provisions referred to in the case and find references in other cases.
Ability to select date ranges, case name, division, case reference, page number etc.; also provision for full text index and Boolean searching: Indices are maintained of the full text and of the contents of each field. The basic search tool offered by the interface is Boolean searching on the full text of the database, delimited where required by field name. Boolean searches can be constructed in free text format, or by completing a form that exploits the field structure. The logical structure of the series, categorising cases by court and by year and volume of the printed series can also be exploited in constructing a search strategy. The pagination of the original printed version of The Law Reports is maintained, allowing instant reference to a case by using the standard method of case referencing by page and volume number.
Appropriate document navigation, annotation, saving and printing functions: A full set of document navigation and output functions is provided, including the ability to annotate text, retrieve and edit annotations, print, cut, paste and save text extracts. Standard Windows facilities are utilised wherever relevant.
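The Case Link feature described above, listing both preceding and succeeding cases, amounts to keeping a citation table together with its inverse. The following is a minimal sketch in Python under stated assumptions: the function name and the case identifiers are invented, and nothing here reflects how the product stores its links.

```python
from collections import defaultdict


def build_case_links(cites):
    """Derive preceding and succeeding case lists from a citation table.

    cites maps a case id to the ids of the cases it refers to.
    """
    preceding = {case: sorted(set(refs)) for case, refs in cites.items()}
    succeeding = defaultdict(list)
    for case, refs in preceding.items():
        for ref in refs:
            succeeding[ref].append(case)  # invert the "refers to" relation
    return preceding, {case: sorted(later) for case, later in succeeding.items()}


# Invented identifiers: the 1995 case refers to both earlier cases.
cites = {
    "1900-A": [],
    "1950-B": ["1900-A"],
    "1995-C": ["1900-A", "1950-B"],
}
preceding, succeeding = build_case_links(cites)
print(preceding["1995-C"])
print(succeeding["1900-A"])
```

The inverse table is the part that changes when new reports arrive: a newly reported case adds entries to the succeeding lists of older cases, which is consistent with the update disc carrying cross-reference updates for cases on the archive discs.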
6.1 The electronic Law Reports
The database of The Law Reports has been named electronic Law Reports, abbreviated to eLR. The commercial launch of eLR took place in February 1996.
6.2 Delivery
eLR will be delivered on a set of CD-ROM discs. One disc of the set will contain current data, and will be updated at least every six months. This update CD-ROM will also contain updates to Case Links applicable to archive data. Context will also offer online access to monthly additions to The Law Reports. The user will be able to transfer data from the CD-ROM set onto hard disk if required.
6.3 Pricing
The pricing structure for eLR is based on an initial fee for the archive, and an annual fee for updates. The archive fee is £10,000 for an organisational licence and £5,000 for an individual licence. The annual update fee is 10% of the archive fee. Surcharges apply for network usage. Context's policy of offering academic discounts will be followed. Special terms are available for pre-launch orders and for individual subscribers to the printed Law Reports.
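As a worked example of this fee structure, the sketch below totals the archive fee and the annual update fees over a number of years. It assumes that one update fee is charged for each year counted, and it ignores network surcharges, academic discounts and special terms; the function name is invented for illustration.

```python
def total_licence_cost(archive_fee, years, update_rate_percent=10):
    """Archive fee plus one update fee per year, in whole pounds."""
    if years < 0:
        raise ValueError("years must not be negative")
    annual_update = archive_fee * update_rate_percent // 100
    return archive_fee + years * annual_update


organisation_5y = total_licence_cost(10_000, years=5)
individual_3y = total_licence_cost(5_000, years=3)
print(organisation_5y, individual_3y)
```

On these assumptions an organisational licence costs £1,000 a year to keep current, and an individual licence £500.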
6.4 Availability
Release of eLR is expected in the summer of 1996. Demonstration versions will be shown at the BILETA conference in March 1996.
Figure Captions (both figures are referred to in Section 3.1):
Figure 1: Early Law Report (1890)
Figure 2: Modern Law Report (1983)