Supplying article header metadata in IngentaConnect's standard DTD
Ingenta's standard DTD for the provision and distribution of header metadata is a modified version of the NLM DTD. We chose the NLM DTD because it is a widely adopted industry standard that our partners should be able to supply and receive readily. We modified it because certain elements within it were not appropriate for use in the IngentaConnect context.
By adopting an industry standard, we minimise data conversion requirements at our end and yours; by modifying it (only slightly!), we ensure that it can deliver good-quality data to us.
The NLM DTD
Our DTD is based on the NLM Journal Publishing (Blue) DTD:
- Get the NLM DTD files by FTP.
- Read the introduction and the technical documentation.
- See some sample data in the NLM format.
Our Modifications
Our modifications are detailed below. (Read more about customising the NLM DTD.)
1. We allow <month> AND <season> at the same time in <pub-date>.
(Less restrictive than standard.)
- Standard %date-model
- Our %date-model:
<!ENTITY % date-model "(((day?, month, season?) | season), year, string-date?)">
2. We require <month> (or, where there is no month, <season>) in <pub-date>.
(More restrictive than standard.)
(See %date-model above.)
3. We have an extra tag: <string-date> inside <pub-date>
(Less restrictive than standard.)
(Note: this tag also exists in the NLM Archiving and Interchange (Green) DTD.)
(See %date-model above.)
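The three date modifications above can be illustrated with a few <pub-date> fragments checked by hand against our %date-model. This is only a sketch: the dates, the pub-type value and the <string-date> wording are invented for illustration, and the fragments are not a complete document.

```xml
<!-- Month and season together, with optional day and string-date -->
<pub-date pub-type="ppub">
  <day>15</day>
  <month>3</month>
  <season>Spring</season>
  <year>2006</year>
  <string-date>15 March (Spring) 2006</string-date>
</pub-date>

<!-- Minimal form: month and year only -->
<pub-date pub-type="ppub">
  <month>3</month>
  <year>2006</year>
</pub-date>

<!-- Season without month: permitted by the second branch of the model -->
<pub-date pub-type="ppub">
  <season>Spring</season>
  <year>2006</year>
</pub-date>

<!-- Not valid under this model: year alone, with neither month nor season.
     <pub-date pub-type="ppub"><year>2006</year></pub-date> -->
```

Note that the element order matters: <day> precedes <month>, and <string-date>, if present, comes last.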
4. We have extended the list of article types.
(Less restrictive than standard.)
The type or category of article is defined in the article-type attribute of the <article> element.
The standard list of permitted values contains most common article types.
We have also extended this list to include:
- dissertation
- rapid-communication
More permitted article types may be added in the future.
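As a brief illustration, one of the added values might be used as follows. The fragment is a skeleton only; the content of <front> is omitted and would need to be supplied for the document to validate.

```xml
<article article-type="rapid-communication">
  <front>
    <!-- journal-meta and article-meta go here -->
  </front>
</article>
```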
5. We only accept references in <nlm-citation> style; that is, <citation> is not allowed.
(More restrictive than standard.)
What is the problem with <citation>?
Its content model includes PCDATA.
What's wrong with that?
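Because PCDATA is allowed, a reference containing nothing but untagged text would still validate, so validation alone could not guarantee that references are structured.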
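For illustration, a reference of the following kind would pass validation against the standard model even though none of its parts are tagged. The reference itself is invented for this example.

```xml
<ref id="r1">
  <citation>Smith J. An example article title. Example Journal 2005; 12(3): 45-67.</citation>
</ref>
```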
OK, what is the alternative for marking up references?
<nlm-citation>. Its content model is restricted to defined elements.
What is the actual modification?
- Standard %ref-model
- Our %ref-model:
<!ENTITY % ref-model "(label?, (nlm-citation | note)+ )" >
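A reference that fits our %ref-model might look like the sketch below. The bibliographic details are invented, and the element order follows our reading of the <nlm-citation> content model, so please check it against the DTD files before relying on it.

```xml
<ref id="r1">
  <label>1</label>
  <nlm-citation citation-type="journal">
    <person-group person-group-type="author">
      <name><surname>Smith</surname><given-names>J</given-names></name>
    </person-group>
    <article-title>An example article title</article-title>
    <source>Example Journal</source>
    <year>2005</year>
    <volume>12</volume>
    <issue>3</issue>
    <fpage>45</fpage>
    <lpage>67</lpage>
  </nlm-citation>
</ref>
```

Every part of the reference sits in its own element, which is what lets validation catch unstructured references.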
6. XML Encoding Requirements
- Please try to use UTF-8 encoding
- Always specify the encoding being used
- Prefer numeric character references to named entities (this avoids confusion over ambiguous characters)
For more information on UTF-8 encoding see the XML specification at http://www.w3.org/TR/xml/
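A minimal sketch of these recommendations: the encoding is declared on the first line, and special characters are written as numeric character references. The title text is invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Numeric references used below:
     &#233;  is e with acute accent (named form: &eacute;)
     &#8211; is an en dash          (named form: &ndash;)
     &#956;  is Greek small mu      (named form: &mu;) -->
<article-title>Caf&#233; culture and the 5&#8211;10 &#956;m range</article-title>
```

Numeric references are understood by any XML parser without further declarations, whereas named entities other than the five predefined in XML depend on entity declarations being available.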
The New Modules
- ingenta-models.ent: supplements the original journalpubcustom-models.ent. Specifically, it overrides %ref-model and %date-model.
- ingenta-modules.ent: supplements the original journalpubcustom-modules.ent. Specifically, it declares the ingenta-models.ent module.
- ingenta-journalpublishing.dtd: the entry point for the DTD. This replaces journalpublishing.dtd (it is an amended version of it).
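As a rough sketch, a document could point at the entry-point DTD as shown below. The SYSTEM identifier here is an assumption: it supposes the DTD files sit in the same directory as the XML file. Substitute whatever path or identifier you have actually been given.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed location: DTD files in the same directory as this document -->
<!DOCTYPE article SYSTEM "ingenta-journalpublishing.dtd">
<article article-type="research-article">
  <!-- front, body and back as in the NLM Journal Publishing DTD -->
</article>
```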
Usage
The following publishers are now successfully using this format to supply their data to us:
- Pharmaceutical Press