This technical whitepaper briefly describes the motivation and some of the mechanisms behind the ECIX project's standard for "electronic databooks."
Overview
The goal of the ECIX project is to define an information interchange standard for the electronic exchange of component data. The standard, called the Pinnacles Component Information Standard (PCIS), was conceived by the Pinnacles Group, a consortium of electronic component manufacturers consisting of Hitachi, Intel, National Semiconductor, Philips Semiconductors, and Texas Instruments. The Pinnacles Group has now become part of the CFI, as the Electronic Component Information Exchange (ECIX) Project. New members are joining to participate in the further development of the ECIX standard.
The Problem
Imagine a design engineer searching for a part to design into a new system. The designer has a set of requirements and a stack of databooks. Finding the required part means flipping through the books, determining each company's scheme for representing the data, and trusting that the books aren't too far out of date. As an engineer, the designer is bound to wonder if there isn't a better way. This question led the ECIX project to create the concept of the electronic databook.
But an electronic databook doesn't only address the designer's problems. For example, a publications group fighting last-minute revisions heroically finishes the documentation just as the product is ready to ship, only to realize that printing, binding, and distribution will cause the actual printed databook to be six weeks late. Or, two companies embarking on a joint development project may discover that their documentation systems have no common denominator.
Finally, imagine the irony of one company using CAD and automated test systems to compile data, then assembling the data with desktop computers, only to print it on paper. Another company takes this paper databook and enters the data by hand, using expensive engineering resources, into another CAD system. The paper is necessary because it is the only accepted interchange format. An electronic databook could provide a more useful standard.
What is an Electronic Databook?
An electronic databook is a comprehensive set of information about electronic components (including active and passive components, materials, and connectors). An electronic databook can contain more than just the text and graphics that are typically printed and distributed by vendors. It can also include crucial design information in a structured and machine-sensible format. It can incorporate computer-sensible data such as CAD files, behavioral and functional models, audio, and video. In short, an electronic databook includes all of the data that a company wishes to provide to facilitate the design-in and support of a component.
An electronic databook can be used in many ways:
- Manufacturers operating under joint development agreements can exchange electronic databooks in their native form.
- The electronic databook can drive page composition software to produce traditional printed databooks.
- The electronic databook can drive desktop publishing software to produce traditional printed databooks.
- The electronic databook can be used to produce documents for the World Wide Web.
- Electronic databooks can be distributed electronically by Internet, LAN, CD-ROM, etc., saving the costs associated with paper distribution.
- Browsers and search engines can take advantage of the standard to search for components across manufacturers.
- CAD systems can use the structured data and models directly, without requiring manual input.
The electronic databook becomes the hub for data exchange.
How an Electronic Databook is Used
The Pinnacles Component Information Standard (PCIS) defines an interchange format for electronic databooks, not a document development format, a search or browsing format, or a presentation format. Furthermore, it is not a goal of the ECIX project to create these formats. However, enough information has been included in the interchange format to use these documents for development, browsing, and/or presentation, or to filter the PCIS document to a format that is better suited for these purposes.
The PCIS standard design has been driven primarily by the nature of the component information; data volume and ease of implementation were only secondary considerations. PCIS is probably not the best format for any one of the possible purposes, but it is the only format that makes all of them possible.
Authoring
The PCIS standard is descriptive rather than prescriptive; that is, it does not specify what elements must be included and in what order. Most companies will want to restrict authors of electronic databooks in some way, both to make documents more standard and to make authoring easier by presenting fewer choices. Companies may design several stricter variations of the standard, one for each type of book. For example, the authoring version for a microprocessor datasheet may require an instruction set division, but this particular division may be prohibited in a memory datasheet.
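The idea of a stricter, company-specific variation can be sketched as a small check layered on top of the descriptive standard. This is only an illustration: the AuthoringProfile class, the check_divisions function, and the two example profiles are hypothetical names and are not defined by PCIS.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AuthoringProfile:
    """Hypothetical company-specific restriction layered on top of PCIS."""
    name: str
    required: frozenset = field(default_factory=frozenset)
    prohibited: frozenset = field(default_factory=frozenset)


def check_divisions(profile, divisions):
    """Return a list of problems; an empty list means the document passes."""
    present = set(divisions)
    problems = [f"missing required division: {d}"
                for d in sorted(profile.required - present)]
    problems += [f"prohibited division present: {d}"
                 for d in sorted(profile.prohibited & present)]
    return problems


# Illustrative profiles matching the microprocessor/memory example in the text.
MICROPROCESSOR = AuthoringProfile(
    "microprocessor datasheet",
    required=frozenset({"Instruction Set Information"}))
MEMORY = AuthoringProfile(
    "memory datasheet",
    prohibited=frozenset({"Instruction Set Information"}))

print(check_divisions(MICROPROCESSOR, ["Features Summary"]))
print(check_divisions(MEMORY, ["Memory Map"]))
```

A document that passes a stricter profile is still a valid PCIS document, because the profile only removes choices that the standard leaves open.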
Companies may also want to add information to their internal versions. One example of this would be a company's internal meta-data, such as the author's name and job tracking. This information would be stripped out for conversion to the PCIS-compliant interchange format.
PCIS can be authored on any SGML-compliant editor (or even "hand-tagged" using a text editor), but not easily. User companies may want to use Pinnacles-compliant authoring tools that are being marketed by third-party tool vendors, and some are even developing their own internal editing tools. Tool development is not part of the Pinnacles effort, however.
Printing
One of the original goals of the ECIX project was to facilitate the printing of electronic databooks. This goal soon broadened to facilitating display, whether in print or on-line. Printed datasheets are still an important output from electronic databooks, however.
Documents can be printed directly from SGML using a format specification such as a FOSI (Formatting Output Specification Instance). However, it is also possible to filter the SGML to a word processing format, such as RTF or MIF, and to print from the word processor.
Browsing and Searching
A major advantage of using a standard format for distributing electronic databooks is the low cost and high speed of electronic distribution. PCIS-compliant documents can be viewed and downloaded over the Internet or from CD-ROMs using commercially available SGML browsers. They can also be translated to other electronic formats. For example, converting SGML to HTML for the World Wide Web is very simple.
But the real utility of a component information standard comes from intelligent searching using the named divisions, information classes, structured data, and dictionary features of Pinnacles. Take, for example, a user searching for information about an SRAM memory with a rise time in the 20 ns range. If this user had only unstructured text, he or she could search for the phrases "SRAM memory" and "rise time." But perhaps the datasheet is titled "Static RAM," and the parameter is named "rise/fall time." In this case, the search will fail. Furthermore, the terms "Rise time," "20" and "ns" might be in separate columns in a table, or worse, the value might be "19."
With a Pinnacles electronic databook, the user could use the Pinnacles dictionary to ensure that the search terms would be standard. Then, the user could search the Characteristics Source structure for a characteristic with the parameter name "rise time" and a value anywhere in the range from 15 to 25 with the units "ns."
Of course, the software to perform this search needs to be provided, either by the companies providing the electronic databooks, the companies using the electronic databooks, or by third-party vendors. (Some are now being developed.) This kind of search is only possible with structured data, and multiple-vendor searches are only possible using a standard format like Pinnacles.
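A search of this kind can be sketched in a few lines once the characteristics are available as structured records. The Characteristic record, the find_characteristics function, and the product names below are assumptions made for illustration; they are not part of PCIS, which defines the data but not the search software.

```python
from dataclasses import dataclass


@dataclass
class Characteristic:
    """Hypothetical flat view of one entry in a Characteristics Source."""
    product: str
    parameter_name: str
    value: float
    unit: str


def find_characteristics(source, parameter_name, low, high, unit):
    """Return entries whose standard parameter name and unit match
    and whose value lies in the closed range [low, high]."""
    return [c for c in source
            if c.parameter_name == parameter_name
            and c.unit == unit
            and low <= c.value <= high]


# Made-up sample data; a value of 19 ns is still found by the range query.
SOURCE = [
    Characteristic("SRAM-A", "rise time", 19.0, "ns"),
    Characteristic("SRAM-B", "rise time", 35.0, "ns"),
    Characteristic("SRAM-C", "fall time", 20.0, "ns"),
]

matches = find_characteristics(SOURCE, "rise time", 15, 25, "ns")
print([c.product for c in matches])
```

Because the name, value, and unit are separate fields, the query does not depend on how any one vendor lays out its tables or titles its datasheets.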
Project Structure
The ECIX project is now actively working on two standards: the Pinnacles Component Information Standard (PCIS), and the Component Information Dictionary Standard (CIDS). PCIS defines the structure of an electronic databook, while CIDS defines the terms used in an electronic databook and proposes a method for maintaining this dictionary.
The Pinnacles Component Information Standard
For PCIS, the ECIX project has created and is continuing to enhance its standard for defining the structure and naming the parts of an electronic databook. PCIS includes the following features:
- Named as well as generic divisions.
- Specialized database-like structures for parametric data, and a mechanism for reflecting this data into the document's body.
- A mechanism for the inclusion of models and modeling data.
- Structures for meta-data that relates to the document (such as literature number, revision history, company name, and address).
Named Divisions
Like many documents, databooks are divided into divisions that may have titles. In most documents, a division is identified solely by its title. Although PCIS allows this type of generic division, it provides named divisions as well. These divisions may have any title, but the user can always locate the information by referencing the division tag.
For example, to search simple text datasheets from several companies for pin-out information, you would need to search for titles such as "Pin Description," "Pin Function," "Pinning," or "Pin Out" (with the alternate spellings "pinout" and "pin-out"). In a Pinnacles electronic databook, however, you could find the division tagged "Pin Out Information," regardless of its title.
Named divisions include:
- Absolute Maximum
- Architectural Functional Description
- Features Summary
- Instruction Set Information
- Memory Map
- Package Information
- Pin Out Information
- Product Characterization Information
- Register Sets
- Soldering and Mounting
Supplementing named divisions are the Information Classes. The six classes of information identified in electronic component documents are:
- Product Summary Information
- Detailed Product Specification Information
- Application Information
- Safety and Environmental Compatibility Information
- Information Concerning Support Tools
- Reliability Data
Any division of a PCIS document may be identified as containing information of a particular class. This allows users to search for information by two independent criteria: for example, Reliability Data in an Absolute Maximum division, or Safety and Environmental Compatibility Information in a Soldering and Mounting division.
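Selecting divisions by tag and by information class, as two independent criteria, might look like the sketch below. The Division record and the select function are illustrative assumptions, and the titles are made up; only the tag and class names come from the lists in this section.

```python
from typing import NamedTuple, Optional


class Division(NamedTuple):
    tag: str                   # named-division tag
    info_class: Optional[str]  # information class, or None if not classified
    title: str                 # free-form title chosen by the author


def select(divisions, tag=None, info_class=None):
    """Filter by either criterion, or both; None means 'do not care'."""
    return [d for d in divisions
            if (tag is None or d.tag == tag)
            and (info_class is None or d.info_class == info_class)]


DIVISIONS = [
    Division("Absolute Maximum", "Reliability Data", "Limiting Values"),
    Division("Absolute Maximum",
             "Detailed Product Specification Information", "Ratings"),
    Division("Soldering and Mounting",
             "Safety and Environmental Compatibility Information", "Handling"),
]

hits = select(DIVISIONS, tag="Absolute Maximum", info_class="Reliability Data")
print([d.title for d in hits])
```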
Structured Data
Perhaps the most important part of a component datasheet is the parametric information it contains, such as the operating voltage, timing characteristics, and pin-out. Most companies supply the same type of information, but they supply it in very different formats. This information is encoded in human-readable format--usually tables and figures--yet the user often needs it in some machine-readable format. For example, a datasheet user might need to feed the information into a CAD tool. Or the user might want to place the information in a database for intelligent searches.
Although each company presents this parametric data differently, analysis showed that it could be described using a common format. This common format is the machine-readable Source. However, the ECIX project did not want to prevent any company from presenting the data in its chosen format. PCIS provides the Reflection mechanism for presenting data from the Source in any unstructured, human-readable way.
Source
Characteristics are one data structure found in the Source. A characteristic is defined as the value of a parameter at specified conditions. For each characteristic, the Characteristics Source contains its hierarchical structure. For example, a characteristic includes a parameter, which in turn may include a parameter symbol, parameter name, and parameter description. The Characteristics Source includes a full description of all characteristics, including information that is occasionally but not consistently displayed (such as Connection Identifiers, which show the pins or signals to which the characteristic applies). In addition, the Characteristics Source can identify the type of information for each characteristic (such as electric, magnetic, mechanical, thermal, etc.), the products to which the characteristic applies, and an indication of whether the characteristic concerns reliability.
Reflections
The Reflection mechanism provides a way to present data from the Source in the body of the document "by reference." A Reflection does not contain the data--it contains pointers to the data in the Source. This way, if the information in the Source is updated, the references in the document text will be automatically changed, too. Similarly, if a characteristic's value is used four times in a datasheet, there is no need to find and change all four occurrences; only the value in the Characteristics Source needs to be changed.
Reflections are used in the body of the document to display information that is stored in the Source. To the end user, the reflected information will look as if it were replicated at the reflected location, but the data is not actually duplicated. The Reflection mechanism allows a piece of data that is stored in the Source to be instantiated in multiple locations.
The body of the document contains text, lists, and tables, just as documents do now. Any of these structures may contain reflections of information from the Source. The most common, and the most complex, place that reflections occur is in characteristics tables. PCIS uses an existing standard model for tables, which defines rows, columns, and entries. In order to create a characteristic table (for example, the AC Characteristics Table) the author would do the following:
- Create an ordinary structural table (that is, a table with the appropriate number of rows, columns, column widths, etc.).
- Insert all non-reflected text such as column headings into the table structure.
- Reflect the desired Sources into the cells of the table, such as Parameter Symbols, Characteristic Values, Conditions, and Characteristics Group Titles.
All text structures may include Reflections, including but not limited to Titles, Paragraphs, List Items, and Definitions.
Throughout the body of the document, wherever information that is in the Source should appear, a Reflection of that element from the Sources will appear. Reflections will be maintained as Reflection elements, and not treated as simple copies of the information in the Sources. When displayed to the user, the contents of the part of the Sources that has been reflected will appear to be at the location of the Reflection in the document.
For example, the Characteristics Source might contain:
<cc id="54334" pid="F234" static="0" reliability="0" acoustic="0" chemical="0" electric="1" info="0" magnetic="0" mechan="1" optical="0" thermal="0">
  <parm id="54335">
    <parm.symbol id="54336">X<sub>YZ</sub></parm.symbol>
  </parm>
  <cc.value id="54337" value.type="MIN">
    <number id="54338">12</number>
    <order.of.mag id="54339">M</order.of.mag>
    <unit id="54340">Hz</unit>
  </cc.value>
  <cc.value id="54341" value.type="MAX">
    <number id="54342">43</number>
    <order.of.mag id="54343">M</order.of.mag>
    <unit id="54344">Hz</unit>
  </cc.value>
</cc>
Then, a paragraph of SGML text might be:
The unit runs at a minimum of <reflection refid="54338"> <reflection refid="54339"><reflection refid="54340"> and is reliable up to <reflection refid="54342"> <reflection refid="54343"><reflection refid="54344">.
Which could be displayed as:
The unit runs at a minimum of 12 MHz and is reliable up to 43 MHz.
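The way a display tool might resolve these pointers can be sketched as follows. This is a minimal illustration, not a PCIS tool: the SOURCE_BY_ID table is a hand-built stand-in for a parsed Source, and the resolve_reflections function and its regular expression are assumptions that handle only the simple reflection form shown above.

```python
import re

# Element content from the example Source, keyed by each element's id.
SOURCE_BY_ID = {
    "54338": "12", "54339": "M", "54340": "Hz",
    "54342": "43", "54343": "M", "54344": "Hz",
}

# Matches only the simple form <reflection refid="..."> used in the example.
REFLECTION = re.compile(r'<reflection refid="([^"]+)">')


def resolve_reflections(text, source_by_id):
    """Replace each Reflection with the Source content it points to."""
    return REFLECTION.sub(lambda m: source_by_id[m.group(1)], text)


PARAGRAPH = (
    'The unit runs at a minimum of <reflection refid="54338"> '
    '<reflection refid="54339"><reflection refid="54340"> and is reliable '
    'up to <reflection refid="54342"> <reflection refid="54343">'
    '<reflection refid="54344">.'
)

print(resolve_reflections(PARAGRAPH, SOURCE_BY_ID))

# Changing the value once in the Source updates every place it is reflected.
updated = dict(SOURCE_BY_ID, **{"54338": "16"})
print(resolve_reflections(PARAGRAPH, updated))
```

The paragraph itself never stores "12" or "43", which is why a single change to the Source is enough.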
Part Numbers
It must always be possible to identify a product unambiguously, because many pieces of information in a datasheet concern a particular product or group of products. The common phrase for this identification is "part number." Unfortunately, "part number" is not specific enough to describe the different levels of product identification that are used in current documents. A datasheet is considered to be about a product, but the datasheet-level product ID is rarely, if ever, the number used to purchase the product. Instead, a more detailed ordering number must be used to specify speed, package, etc.
Product Identification Elements
The Pinnacles standard defines three levels of product identification:
- Specific Product ID: Identifies a physical product that can be purchased, including all the option codes a customer would need to order a product.
- Product ID Fragment: Identifies products that are related in some manner, such as packaging or temperature characteristics. A Product ID Fragment identifies a useful set of products for some purpose, such as discussion in a characteristics table.
- Generic Product ID: Identifies a specific product or collection of specific products at the level of specificity described in a single document (such as a single datasheet, application note, or databook).
The full Specific Product ID is rarely used inside a datasheet; it is more common to use a fragment including a wildcard, such as using "the 94xx processor" to mean "the 9419, 9420, and 9430 processors." There is no way to tell which of the three levels is intended by looking at an ID. The Specific Product ID and Generic Product ID for any given document may be the same. Theoretically, all three could be the same.
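A wildcard fragment such as "94xx" can be expanded against a list of known product IDs. The sketch below assumes that each lowercase "x" stands for exactly one digit; that convention, and the function names, are illustrative assumptions rather than PCIS rules.

```python
import re


def fragment_to_regex(fragment):
    """Compile a fragment, treating each 'x' as a one-digit wildcard
    (an assumed convention, used here only for illustration)."""
    pattern = "".join(r"\d" if ch == "x" else re.escape(ch)
                      for ch in fragment)
    return re.compile(pattern)


def expand_fragment(fragment, known_ids):
    """Return the known IDs that the fragment covers, in their given order."""
    rx = fragment_to_regex(fragment)
    return [pid for pid in known_ids if rx.fullmatch(pid)]


KNOWN_IDS = ["9419", "9420", "9430", "8051"]
print(expand_fragment("94xx", KNOWN_IDS))
```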
In addition, there is sometimes an Alternate Product ID, which is an alias for the Specific Product ID. An alias may be used for an internal warehouse or inventory number, a customer's name for a product, or the informal or "street" name for a product.
Many levels of information in a PCIS document can be associated with a Product ID. The whole document, a division, a paragraph, even a warning can be associated with a product or group of products, using any level of product ID. By inference, if a structure is not specifically associated with a product ID, it is assumed to be about the product ID of the element in which it is nested. In very simple documents, the only product IDs may be for the entire document. However, in many documents, parts of the document discuss one product ID or set of product options, while other parts of the document discuss different ones.
The product ID associated with an element may reside at any level; that is, it may be a Generic Product ID, a Product ID Fragment, or a Specific Product ID.
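The inheritance rule for nested structures can be sketched as a walk up the element tree. The Element class and the effective_product_id function are hypothetical; they model only the rule that an element with no product ID of its own takes the one from the element in which it is nested.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical document node: a document, division, paragraph, etc."""
    name: str
    product_id: Optional[str] = None
    parent: Optional["Element"] = field(default=None, repr=False, compare=False)
    children: List["Element"] = field(default_factory=list)

    def add(self, child):
        """Nest child inside this element and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def effective_product_id(element):
    """Return the element's own product ID, or the nearest ancestor's."""
    node = element
    while node is not None:
        if node.product_id is not None:
            return node.product_id
        node = node.parent
    return None


doc = Element("document", product_id="94xx")
division = doc.add(Element("division"))
warning = division.add(Element("warning", product_id="9430"))
paragraph = division.add(Element("paragraph"))

print(effective_product_id(paragraph))
print(effective_product_id(warning))
```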
Models and Modeling Data
In the electronic component industry, computer-sensible data associated with a component (such as simulation models) are rarely included in print datasheets, but are often supplied as a separate piece of data, often by a third-party vendor. The preliminary PCIS decision was that modeling information could either be encapsulated in an SGML file or stored as an external file that could be referenced from within the SGML file. In either case, the model would be stored in its native format, not in SGML. PCIS does not try to implement a new modeling language--it allows existing standards to be used.
Meta-data
There are two types of meta-data associated with databooks: general meta-data and company-specific meta-data. General meta-data comprises information about the databook or datasheet such as the document title, publication date, and what parts of the information have been changed. Company-specific meta-data may include information on who wrote or changed portions of the information, what has been reviewed and by whom, and which process, machine, or corporate division produced various pieces of information. General meta-data was declared to be within the PCIS scope, and company-specific meta-data to be outside its scope.
Why Was SGML Used?
The ECIX project decided to develop the application standard using the Standard Generalized Markup Language (SGML). SGML, defined by the international standard ISO 8879 in 1986, is a widely accepted and supported standard for the encoding of structured information. SGML defines the range of valid structures for each type of document using a Document Type Definition, or DTD.
The ECIX project selected SGML as its document modeling and interchange format for the following reasons:
- It is vendor and platform independent.
- It allows data expressed in other industry standards to be included and referenced.
- It is supported by commercially available tools and services.
- It ensures the longevity of the electronic component data.
- It supports the current publishing processes (such as paper and CD-ROM).
Component Information Dictionary Standard
The purpose of the Component Information Dictionary Standard is to provide a computer-sensible, standard dictionary of terms so that authors and users of component information have a common and unambiguous understanding of the meaning of each term.
The component dictionary consists of five sub-dictionaries:
- A quantitative dictionary
- A non-quantitative dictionary
- A technology dictionary
- A representation dictionary
- A general terms dictionary
These sub-dictionaries have been identified to provide a structure for the different types of terms defined. From the user's perspective, there is only a single dictionary.
The component dictionary provides for many different ways to establish context. All words that are used for computer-sensible purposes are called terms. The set of all possible terms is properly called the vocabulary as contained in the component dictionary, which in turn has the five proper sub-dictionaries. These five sub-dictionaries are the first possible context for a term. The taxonomy that was chosen comes from the different uses of terms within component information documents and computer systems. The context establishes the framework for understanding.
Within each dictionary, there is a change control methodology that the ECIX project has defined for electronic databooks. Change control attributes can provide another possible context.
After the five dictionary classifications, the more refined context is captured within the term's definition. Within a particular dictionary, the definition of a term can have several possible elements to clarify the definition's context. For instance, if the term is quantitative--that is, something to be measured--then units describing the measurement should be within the definition. The Pinnacles component dictionary specifies the main content elements for any definition in the five different dictionaries.
Since CIDS is a closed set--that is, all terms used within the dictionary must be defined in the dictionary--complex signature structures can be made by assembling different terms. For instance, a TTL rise time could be expressed something like "rise_time.TTL" to form a complex signature. Since rise_time is quantitative, within the CIDS Quantitative dictionary, the dictionary term rise_time with a term definition should be found for TTL technology and units type of seconds. Also, the dictionary term "TTL" would be defined within the CIDS Technology dictionary.
The approach that CIDS uses for a dictionary--that is, one term may have multiple definitions--allows a wide variety of uses, such as for authoring electronic databooks, creating and querying databases, designing application tools, etc. For instance, in a database, a data dictionary may be constructed with an index number for a particular term and definition, in which case the index number only needs to reference the complex signature. Conversely, within CIDS, the particular term definition can make reference to the external data dictionary index number. This is the approach CIDS uses with the IEC dictionary, where for example "authorInternalID" could have the value of "AAE976-005." For the IEEE dictionary, the term definition would contain an authorization element that has the content "IEEE." In this case, to be a closed set, the term "IEEE" would need to be defined in the General dictionary.
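The closed-set rule for a complex signature can be checked mechanically. In the sketch below, the nested-dictionary layout, the "." separator taken from the "rise_time.TTL" example, and the undefined_terms function are all assumptions made for illustration; CIDS itself is defined in SGML, not Python.

```python
# Hypothetical miniature dictionary with two of the five sub-dictionaries.
CIDS = {
    "quantitative": {"rise_time": {"units": "seconds"}},
    "technology": {"TTL": {}},
}


def undefined_terms(signature, dictionary):
    """Return the parts of a dotted signature that no sub-dictionary defines.

    An empty result means the signature respects the closed set.
    """
    known = {term for sub in dictionary.values() for term in sub}
    return [part for part in signature.split(".") if part not in known]


print(undefined_terms("rise_time.TTL", CIDS))
print(undefined_terms("rise_time.ECL", CIDS))
```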
Strategies
The current intent of the ECIX project for CIDS is to include at least all terms used in databooks. It should also allow multiple definitions for the same CIDS term with identification of the source of the definition. This means that multiple definitions can be stated and that market forces over time will dictate which CIDS term will be the preferred or de facto standard. However, the CIDS dictionary will initially acknowledge the terms used in the IEC-1360-4 draft standard as the preferred standard.
In 1996, the ECIX project needs to consider a mechanism for overloading dictionary terms. In concept, an overloading mechanism would allow the end user to designate a standard dictionary (such as IEC-1360-4) to be the default database. Then the user would be allowed to specify one or more additional dictionaries that are to be searched for terms that have the same name but different definitions. The user would specify the order in which dictionaries are searched, and definitions in the last dictionary on the list would overload those in the preceding dictionaries. This strategy would allow product-specific dictionaries to be used along with industry-standard dictionaries. Redundancies between different dictionaries would then be unnecessary.
Also, to accommodate the need for innovation in the marketplace, a mechanism needs to be developed for adding new, proprietary dictionary terms. Vendors need the capability to develop new terms or definitions in the context of new technology innovations. At first, these dictionary terms would be shipped in the vendor's database along with the rest of their component information. Later, the vendor would be able to submit these terms to the CIDS Working Group for addition to the standard dictionary as part of the regular dictionary maintenance cycle.
SGML document type definitions (DTDs) as well as an Express information model also need to be developed.
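The proposed overloading behavior resembles a layered lookup in which the last dictionary in the user's search order wins. The sketch below uses Python's collections.ChainMap to illustrate the concept; the build_lookup function and the placeholder definitions are assumptions, and no such mechanism is specified by CIDS yet.

```python
from collections import ChainMap


def build_lookup(search_order):
    """Combine dictionaries given in the user's search order.

    ChainMap consults its first mapping first, so the order is reversed
    to let the last dictionary on the list overload the earlier ones.
    """
    return ChainMap(*reversed(search_order))


# Placeholder definitions only; these are not real dictionary entries.
STANDARD = {
    "rise_time": "standard definition (placeholder)",
    "fall_time": "standard definition (placeholder)",
}
PRODUCT = {"rise_time": "product-specific definition (placeholder)"}

terms = build_lookup([STANDARD, PRODUCT])
print(terms["rise_time"])
print(terms["fall_time"])
```

The product-specific dictionary needs to hold only the terms it redefines; every other term falls through to the standard dictionary, which is why redundancy between dictionaries becomes unnecessary.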
Tactics
The dictionary needs to support the Pinnacles Component Information Standard (PCIS). This means that the dictionary structure needs to be stated in an SGML format, and a set of SGML DTDs needs to be defined. PCIS tags should be reused whenever available to make for a smooth interaction.
In addition to providing such an architecture, the ECIX project will provide the actual content of a dictionary. Furthermore, the ECIX project will provide a procedure and recommend an owner for ongoing updating and maintenance of the dictionary standard.
For the dictionary to be used, it must be available to the authors during the component information creation process. The tactical implementation must be sensitive to the capabilities and limitations of available authoring tools.
The ECIX project will therefore develop a Component Information Dictionary Standard (CIDS). It will be based on SGML and contain Elements, Attributes, and SGML content models (data models), similar to the PCIS architecture standard. The model will be tested with terms from the IEC 1360-4 reference collection. The Dictionary WG will follow established processes to submit this architecture standard to be approved as an international standard.
Once CIDS has been approved, the ECIX project will evaluate and recommend methods for filling the dictionary with actual content. Possible methods are to use a Request For Technology process and/or to contract the work to a third party. The ECIX project will follow established processes to submit any content-filled dictionary for approval.
The ECIX project will also define requirements for the ongoing process of updating and maintaining the dictionary. The Dictionary WG does not intend to become the body for this long-term maintenance effort.
References
For more information on SGML, see the following references:
- Goldfarb, Charles. The SGML Handbook. New York, Oxford University Press, 1990.
- Graphic Communications Association, 1730 North Lynn Street, Suite 604, Arlington, Virginia 22209-2085, USA.
- SoftQuad, Inc. The SGML Primer. SoftQuad's Quick Reference Guide to the Essentials of the Standard: The SGML Needed for Reading a DTD and Marked-up Documents and Discussing Them Reasonably. Toronto, SoftQuad, Inc. 1991.
- Travis, Brian & Waldt, Dale. The SGML Implementation Guide. Springer-Verlag, 1995.
- Van Herwijnen, Eric. Practical SGML. Kluwer Academic Publishers, 1990 (revised edition, 1994).
- ArborText, Inc. SGML: Getting Started
For more information, contact: email@example.com