I'm used to authoring and serving HTML. Can I lear

Yes, very easily, although there is still a need for tutorials, simpler tools, and more examples of XML documents. Well-formed XML documents may look similar to HTML except for some small but very important points of syntax. The big practical difference is that XML has to stick to the rules. HTML browsers let you serve them broken or corrupt HTML because they don't do a formal parse: they silently skip over the broken bits instead. With XML your files have to be correct or they simply won't work at all. One outstanding problem is that some browsers claiming XML conformance are also broken. Try yours on the test file at http://www.ucc.ie/test.xml.
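
The difference in strictness can be seen with any XML parser. The following is a minimal sketch using Python's standard library; the helper name is_well_formed and the two sample strings are assumptions made for illustration, not part of the FAQ.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<p>Some <b>bold</b> text</p>"
bad = "<p>Some <b>bold text</p>"   # <b> is never closed: tolerated by HTML browsers, fatal in XML

print(is_well_formed(good))
print(is_well_formed(bad))
```

An HTML browser would probably display both strings; an XML parser has to reject the second one outright.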

Do I have to switch from SGML or HTML to XML?

No, existing SGML and HTML application software will continue to work with existing files. But as with any enhanced facility, if you want to view or download and use XML files, you will need to use XML-aware software. There is much more being developed for XML than there ever was for SGML, so a lot of users are moving.

IBM developerWorks XML zone

What else has changed between SGML and XML?

The principal changes are in what you can do in writing a Document Type Definition (DTD). To simplify the syntax and make it easier to write processing software, a large number of SGML markup declaration options have been suppressed (see the list of omitted features). An extra Name Start Character is permitted in XML Names (the colon) for use with namespaces (enabling DTDs to distinguish element source, ownership, or application). A colon may only appear in mid-name, not at the start or the end.

A recommended XML technology website

W3C and XML standard vocabulary reference

Who is responsible for XML?

XML is a project of the World Wide Web Consortium (W3C), and the development of the specification is being supervised by their XML Working Group. A Special Interest Group of co-opted contributors and experts from various fields contributed comments and reviews by email. XML is a public format: it is not a proprietary development of any company. The v1.0 specification was accepted by the W3C as a Recommendation on 10 February 1998.

Do I have to change any of my server software to work with XML?

The only changes needed are to make sure your server serves up .xml, .css, .dtd, .xsl, and whatever other file types you will use with the correct MIME media (content) types. The details of the settings are specified in RFC 3023. Most new versions of Web server software come preset. All that is needed is to edit the mime-types file (or its equivalent: as a server operator you already know where to do this) and add or edit the relevant lines for the right media types. In some servers (eg Apache), individual content providers or directory owners may also be able to change the MIME types for specific file types from within their own directories by using directives in a .htaccess file.

The media types required are: text/xml for XML documents which are `readable by casual users'; application/xml for XML documents which are `unreadable by casual users'; text/xml-external-parsed-entity for external parsed entities such as document fragments (eg separate chapters which make up a book) subject to the readability distinction of text/xml; application/xml-external-parsed-entity for external parsed entities subject to the readability distinction of application/xml; application/xml-dtd for DTD files and modules, including character entity sets. The RFC has further suggestions for the use of the +xml media type suffix for identifying ancillary files such as XSLT (application/xslt+xml).

If you run scripts generating XHTML which you wish to be treated as XML rather than HTML, they may need to be modified to produce the right media type, and also the relevant Document Type Declaration if your application requires the output to be validated.
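
For scripts that serve files themselves, the mapping can be sketched as a simple lookup table. The media type names below are the ones listed in the answer above; the pairing with file extensions is an assumption for illustration (in particular the .ent extension is hypothetical, application/xml is chosen arbitrarily over text/xml, and text/css is the usual CSS type rather than something taken from RFC 3023).

```python
import os.path

# Extension-to-media-type table. The type names come from the answer above;
# the choice of extensions is an illustrative assumption.
MEDIA_TYPES = {
    ".xml": "application/xml",
    ".dtd": "application/xml-dtd",
    ".ent": "application/xml-external-parsed-entity",  # hypothetical extension
    ".xsl": "application/xslt+xml",
    ".css": "text/css",
}

def media_type_for(filename, default="application/octet-stream"):
    """Look up the media type to send for a file, by its extension."""
    ext = os.path.splitext(filename)[1].lower()
    return MEDIA_TYPES.get(ext, default)

print(media_type_for("book.dtd"))
print(media_type_for("notes.txt"))
```

On a real server the same table would normally live in the mime-types file or in .htaccess directives rather than in code.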

Why is XML such an important development?

It removes two constraints which were holding back Web developments: dependence on a single, inflexible document type (HTML); and the complexity of full SGML, whose syntax allows many powerful but hard-to-program options. XML simplifies the levels of optionality in SGML, and allows the development of user-defined document types on the Web.

What is HTML?

HTML is the HyperText Markup Language (RFC 1866), a small application of SGML used on the Web. It defines a very simple class of report-style documents, with section headings, paragraphs, lists, tables, and illustrations, with a few informational and presentational items, and some hypertext and multimedia. See the question on extending HTML. There is also an XML version of HTML.

Aren't XML, SGML, and HTML all the same thing?

Not quite; SGML is the mother tongue, and has been used for describing thousands of different document types in many fields of human activity, from transcriptions of ancient Irish manuscripts to the technical documentation for stealth bombers, and from patients' clinical records to musical notation. SGML is very large and complex, however, and probably overkill for most common applications.

XML is an abbreviated version of SGML, to make it easier for you to define your own document types, and to make it easier for programmers to write programs to handle them. It omits all the options, and most of the more complex and less-used parts of SGML, in return for the benefits of being easier to write applications for, easier to understand, and more suited to delivery and interoperability over the Web. But it is still SGML, and XML files may still be processed in the same way as any other SGML file (see the question on XML software).

HTML is just one of the SGML or XML applications, the one most frequently used on the Web. Technical readers may find it more useful to think of XML as being SGML-- rather than HTML++.

Can I (and my authors) still use client-side inclusions?

The same rule applies as for server-side inclusions, so you need to ensure that any embedded code which gets passed to a third-party engine (eg calls to SQL, Java, LiveWire, etc) does not contain any characters which might be misinterpreted as XML markup (ie no angle brackets or ampersands). Either use a CDATA marked section to avoid your XML application parsing the embedded code, or use the standard &lt; and &amp; character entity references instead.
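
Both options can be tried out with a parser. This is a minimal sketch in Python; the query string and the script element name are hypothetical, chosen only to show a less-than sign and an ampersand surviving the round trip.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical embedded code containing characters that look like markup.
query = "SELECT name FROM items WHERE price < 5 AND tag = 'R&D'"

# Option 1: a CDATA marked section, so the parser does not interpret the code.
# (This only works while the code itself does not contain the sequence "]]>".)
with_cdata = "<script><![CDATA[" + query + "]]></script>"

# Option 2: replace the markup characters with entity references.
with_entities = "<script>" + escape(query) + "</script>"

print(with_entities)
print(ET.fromstring(with_cdata).text == ET.fromstring(with_entities).text)
```

Either way, the application on the receiving end gets the original code back unchanged.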

Why do we need all this SGML stuff? Why not just use Word or Notes?

Information on a network which connects many different types of computer has to be usable on all of them. Public information cannot afford to be restricted to one make or model or manufacturer, or to cede control of its data format to private hands. It is also helpful for such information to be in a form that can be reused in many different ways, as this can minimize wasted time and effort. Proprietary data formats, no matter how well documented or publicized, are simply not an option: their control still resides in private hands and they can be changed or withdrawn arbitrarily without notice. SGML is the international standard for defining this kind of application, but those who need an alternative based on different software for other purposes are entirely free to implement similar services using such a system, especially if they are for private use.

Why not just carry on extending HTML?

HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information. XML allows groups of people or organizations to create their own customized markup applications for exchanging information in their domain (music, chemistry, electronics, hill-walking, finance, surfing, petroleum geology, linguistics, cooking, knitting, stellar cartography, history, engineering, rabbit-keeping, mathematics, genealogy, etc). HTML is at the limit of its usefulness as a way of describing information, and while it will continue to play an important role for the content it currently represents, many new applications require a more robust and flexible infrastructure.

How do I create my own DTD?

You need to use the XML Declaration Syntax (very simple: declaration keywords begin with <!). For a shopping list, the declarations are:

<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>

It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one (that's the plus sign) but there may be more than one. It also says that the Item element may contain parsed character data (PCDATA, ie text). Because there is no other element which contains Shopping-List, that element is assumed to be the `root' element, which encloses everything else in the document.

You can now use it to create an XML file: give your editor a Document Type Declaration which names Shopping-List as the root element and points to the file in which you saved the DTD. Now your editor will let you create files according to the pattern:

<Shopping-List>
<Item>Chocolate</Item>
<Item>Sugar</Item>
<Item>Butter</Item>
</Shopping-List>

It is possible to develop complex and powerful DTDs of great subtlety, but for any significant use you should learn more about document systems analysis and document type design. See for example Developing SGML DTDs by Maler and el Andaloussi, Prentice Hall, 1997, ISBN 0-13-309881-8, which was written for SGML, but perhaps 95% of it applies to XML as well, as XML is much simpler than full SGML -- see the list of restrictions which shows what has been cut out.
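
The shopping-list pattern described above can be tried out as follows. This is only a sketch: Python's standard library parser reads the declarations but does not validate against them, so the function follows_shopping_list_pattern is a hand-written, hypothetical stand-in for what a validating parser or editor would check. The declarations are placed in an internal subset here so that the example needs no separate DTD file.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE Shopping-List [
  <!ELEMENT Shopping-List (Item)+>
  <!ELEMENT Item (#PCDATA)>
]>
<Shopping-List>
  <Item>Chocolate</Item>
  <Item>Sugar</Item>
  <Item>Butter</Item>
</Shopping-List>"""

def follows_shopping_list_pattern(root):
    """Hand-written check: a Shopping-List with one or more text-only Item children."""
    if root.tag != "Shopping-List" or len(root) == 0:
        return False
    return all(child.tag == "Item" and len(child) == 0 for child in root)

root = ET.fromstring(DOC)
print(follows_shopping_list_pattern(root))
print([item.text for item in root])
```

For real work a validating parser should do this check from the DTD itself, rather than code written by hand for each document type.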

What is XML for?

XML is intended `to make it easy and straightforward to use SGML on the Web: easy to define document types, easy to author and manage SGML-defined documents, and easy to transmit and share them across the Web.' It defines `an extremely simple dialect of SGML which is completely described in the XML Specification. The goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML.' `For this reason, XML has been designed for ease of implementation, and for interoperability with both SGML and HTML' [Quotes are from the XML specification].

XML is not just for Web pages: it can be used to store any kind of structured information, and to enclose or encapsulate information in order to pass it between different computing systems which would otherwise be unable to communicate.

Where can I discuss implementation and development of XML?

The two principal online media are the Usenet newsgroups and the mailing lists. The newsgroups are comp.text.xml and to a certain extent comp.text.sgml. Ask your Internet Provider how to access these, or use a Web interface like Google.

The general-purpose mailing list for public discussion is XML-L: to subscribe, visit the Web site and click on the link to join. You can also access the XML-L archives from the same URL.

For those developing components for XML there is an xml-dev mailing list. You can subscribe by sending a 1-line mail message to xml-dev-request@lists.xml.org saying just SUBSCRIBE. The xml-dev archives are at OASIS http://lists.xml.org/archives/xml-dev/. Note that this list is for those people actively involved in developing resources for XML. It is not for general information about XML (see this FAQ and other sources) or for general discussion about XML implementation and resources (see below).

There is a list for discussing XSL, the stylesheet language: XSL-List. For details of how to subscribe, see http://www.mulberrytech.com/xsl/xsl-list. Andrew Watt writes that there is a mailing list specifically for XSL-FO only, on eGroups.com. You can subscribe by sending a message to XSL-FO-subscribe@egroups.com.

When you join a mailing list you will be sent details of how to use it. Please Read The Fine Documentation because it contains important information, particularly about what to do if your company or ISP changes your email address. Please note that there is a lot of inaccurate and misleading information published in print and on the Web about subscribing to mailing lists. Don't guess: read the documentation.

Mailing lists in other languages

Gianni Rubagotti writes: A new Italian mailing list about XML is born: to subscribe, send a mail message without a subject line but with text saying subscribe XML-IT to majordomo@ananas.usr.dsi.unimi.it. Everyone, Italian or not, who wants to debate about XML in our tongue is welcome.

JP Theberge writes: A French mailing list about XML has been created. To subscribe, send subscribe to xml-request@trisome.com.

Jarno Elovirta writes: a Finnish mailing list about XML has been set up. To subscribe, send an email to majordomo@evitech.fi with subscribe XML-Fin in the message body. The list is also hypermailed for online reference at http://users.evitech.fi/lists/xml-fin/.

I'm trying to understand the XML Spec: why does XML have such difficult terminology?

For implementation to succeed, the terminology needs to be precise. Design goal eight of the specification tells us that `the design of XML shall be formal and concise'. To describe XML, the specification therefore uses formal language drawn from several fields, specifically those of text engineering, international standards and computer science. This is often confusing to people who are unused to these disciplines because they use well-known English words in a specialised sense which can be very different from their common meanings -- for example: grammar, production, token, or terminal.

The specification does not explain these terms because of the other part of the design goal: the specification should be concise. It doesn't repeat explanations that are available elsewhere: it is assumed you know this and either know the definitions or are capable of finding them. In essence this means that to grok the fullness of the spec, you do need a knowledge of some SGML and computer science, and have some exposure to the language of formal standards.

Sloppy terminology in specifications causes misunderstandings and makes it hard to implement consistently, so formal standards have to be phrased in formal terminology. This FAQ is not a formal document, and the astute reader will already have noticed it refers to `element names' where `element type names' is more correct; but the former is more widely understood.

Those new to the terminology may find it useful to read something like the Gentle Introduction to SGML chapter of the TEI Guidelines. Thanks to Bob DuCharme for suggestions and some bits from his book on the XML Spec.

What do I have to do to use XML?

For the average user of the Web, nothing except use a browser which works with XML (see the question about browsers). Remember some XML components are still being implemented, so some features are still either undefined or have yet to be written. Don't expect everything to work yet!

You can use XML browsers to look at some of the stable XML material, such as Jon Bosak's Shakespeare plays and the molecular experiments of the Chemical Markup Language (CML). There are some more example sources listed at http://xml.coverpages.org/xml.html#examples, and you will find XML (particularly in the disguise of XHTML) being introduced in places where it won't break older browsers.

If you want to start preparations for creating your own XML files, see the questions in the Authors' Section and the Developers' Section.

Do I have to know HTML or SGML before I learn XML?

No, although it's useful because a lot of XML terminology and practice derives from 15 years' experience of SGML. Be aware that `knowing HTML' is not the same as `understanding SGML'. Although HTML was written as an SGML application, browsers ignore most of it (which is why so many useful things don't work), so just because something is done a certain way in HTML browsers does not mean it's correct, least of all in XML.

Where can I get an XML browser?

Remember the XML specification is still relatively new, so a lot of what you see now is experimental, and because the potential number of different XML applications is unlimited, no single browser can be expected to handle 100% of everything. Some of the generic parts of XML (eg parsing, tree management, searching, formatting, etc) are being combined into general-purpose libraries or toolkits to make it easier for developers to take a consistent line when writing XML applications. Such applications can then be customized by adding semantics for specific markets, or using languages like Java to develop plugins for generic browsers and have the specialist modules delivered transparently over the Web.

MSIE 5.5 handles XML but currently still renders it via the HTML model. Microsoft were also the architects of a hybrid (invalid) solution (islands) in which you could embed fragments of XML in HTML files because current HTML-only browsers simply ignored element markup which they didn't recognize, but this has now been superseded by XHTML. MSIE includes an implementation of an obsolete draft of XSLT (WD-xsl): you need to upgrade it and replace the parser (see http://www.netcrucible.com/ for details).

The publicly-released Netscape code (Mozilla) and the almost indistinguishable Netscape 6 (there is no v5) have XML/CSS support, based on James Clark's expat XML parser, and this seems to be more robust, if less slick, than MSIE. Mozilla 0.9 is reported to have some XSLT capability.

The authors of the former MultiDoc Pro SGML browser, CITEC, joined forces with Mozilla to produce a multi-everything browser called DocZilla, which reads HTML, XML, and SGML, with XSL and CSS stylesheets. This runs under NT and Linux and is currently still in the alpha stage. See http://www.doczilla.com for details. This is by far the most ambitious browser project, and is backed by solid SGML expertise, but seems to be rather a long time coming.

Opera now supports XML and CSS on MS-Windows and Linux and is the most complete implementation so far. The browser size is tiny by comparison with the others, but features are good and the speed is excellent, although the earlier slavish insistence on mimicking everything Netscape did, especially the bugs, still shows through in places.

See also the notes on software for authors and developers, and the more detailed list on the XML pages in the SGML Web site at http://xml.coverpages.org/.

What does an XML document look like inside?

The basic structure is very similar to most other applications of SGML, including HTML. XML documents can be very simple, with no document type declaration (DTD), and straightforward nested markup of your own design: Hello, world! Stop the planet, I want to get off! Or they can be more complicated, with a DTD specified (see the question on document types), and maybe an internal subset (local DTD changes in [square brackets]), and a more complex structure: ]> Hello, world! Vitam capias Or they can be anywhere between: a lot will depend on how you want to define your document type (or whose you use) and what it will be used for.
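
As an illustration of the simple case, here is a small document with no DTD, parsed with Python's standard library. The element names conversation, greeting, and response are assumptions made up for this sketch; only the text content comes from the answer above.

```python
import xml.etree.ElementTree as ET

# Element names are hypothetical; any names of your own design would do.
SIMPLE = """<?xml version="1.0"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>"""

root = ET.fromstring(SIMPLE)
for child in root:
    print(child.tag, "->", child.text)
```

Because the file is well-formed, a parser can build the element tree without ever having seen a DTD.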

Is there a Developer's API kit for XML?

Several are available or under development, such as SAX. The established conversion and application development engines like Balise, Omnimark, and SGMLC all have XML capability and they all provide APIs. Details of these and of XML software of all kinds are on the XML Web pages.
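
To give a flavour of what a SAX-style API looks like, here is a minimal sketch using the SAX binding in Python's standard library. The handler class ElementCounter and the sample document are assumptions for illustration; other languages have their own SAX bindings with similar event callbacks.

```python
import xml.sax

class ElementCounter(xml.sax.ContentHandler):
    """Count how many times each element type name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every start-tag it encounters.
        self.counts[name] = self.counts.get(name, 0) + 1

handler = ElementCounter()
xml.sax.parseString(b"<list><item>a</item><item>b</item></list>", handler)
print(handler.counts)
```

The parser drives the application through callbacks, so the whole document never has to be held in memory at once.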

Where do I find more information about XML?

Online, there's the XML Specification and ancillary documentation available from the W3C; Robin Cover's SGML/XML Web pages with an extensive list of online reference material and links to software; and a summary and condensed FAQ from Tim Bray.

The items listed below are the ones I have been told about. Please mail me if you come across others.

An annual XML Conference is run by the Graphic Communications Association. XML 2001 is in Orlando, Florida, on December 9-14. See the GCA's Web site for details. The Extreme Markup Languages 2001 conference takes place on 12-17 August at Le Centre Sheraton, Montréal, Canada. The annual XML Summer School takes place in Oxford on 20-25 July 2001. There are many other XML events around the world: most of them are announced on the mailing lists and newsgroups.

There are lists of books, articles, and software for XML in Robin Cover's SGML and XML Web pages. That site should always be your first port of call: please look there first before using the form in this FAQ to ask about software or documentation.

Which parts of an XML document are case-sensitive?

All of it, both markup and text. This is significantly different from HTML and most other SGML applications. It was done to allow markup in non-Latin-alphabet languages and to obviate problems with case-folding in scripts which are caseless.

Element type names are case-sensitive: you must stick with whatever combination of upper- or lower-case you use to define them (either by first usage or in a DTD). So the start-tag and end-tag of an element must match in case exactly, and two names which differ only in case are two different element types. For well-formed files with no DTD, the first occurrence of an element type name defines the casing.

Attribute names are also case-sensitive, on a per-element basis: for example, width and WIDTH used in the same file are two separate attributes, because the different casings distinguish them.

Attribute values are also case-sensitive. CDATA values (eg HRef="MyFile.SGML") always have been, but ID and IDREF attributes are now case-sensitive as well.

All entity names (eg &Aacute;), and your data content (text), are case-sensitive as always.
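
These rules are easy to confirm with a parser. A minimal sketch in Python follows; the element names doc, pic, and body are hypothetical, and the attribute values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# pic and PIC are different element types; width and WIDTH are different attributes.
root = ET.fromstring('<doc><pic width="7in" WIDTH="6in"/><PIC/></doc>')
tags = [child.tag for child in root]
attributes = dict(root[0].attrib)
print(tags)
print(attributes)

# A start-tag and end-tag which differ in case do not match, so this is not well-formed.
try:
    ET.fromstring("<body>text</BODY>")
    matched = True
except ET.ParseError:
    matched = False
print(matched)
```

In HTML the last example would be accepted without comment, which is exactly the habit that causes trouble when moving to XML.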

What's a Document Type Definition (DTD) and where do I get one?

A DTD is a formal description in XML Declaration Syntax of a particular type of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together. For example, if you want a document type to be able to describe Lists which contain Items, the relevant part of your DTD might contain something like this:

<!ELEMENT List (Item)+>
<!ELEMENT Item (#PCDATA)>

This defines a list as an element type containing one or more items (that's the plus sign); and it defines items as element types containing just plain text (Parsed Character Data or PCDATA). Validating parsers read the DTD before they read your document so that they can identify where every element type ought to come and how each relates to the other, so that applications which need to know this in advance (most editors, search engines, navigators, databases) can set themselves up correctly. The example above lets you create lists like:

<List><Item>Chocolate</Item><Item>Music</Item><Item>Surfing</Item></List>

How the list appears in print or on the screen depends on your stylesheet: you do not normally put anything in the XML to control formatting like you had to do with HTML before stylesheets. This way you can change style easily without ever having to edit the document itself.

A DTD provides applications with advance notice of what names and structures can be used in a particular document type. Using a DTD when editing files means you can be certain that all documents which belong to a particular type will be constructed and named in a consistent and conformant manner. DTDs are less important for processing documents already known to be well-formed, but they are still needed if you want to take advantage of XML's special attribute types like the built-in ID/IDREF cross-reference mechanism.

There are thousands of DTDs already in existence in all kinds of areas (see the SGML/XML Web pages for pointers). Many of them can be downloaded and used freely; or you can write your own (see the question on creating your own DTD). Existing SGML DTDs need to be converted to XML for use with XML systems: read the question on converting SGML DTDs to XML, and expect to see announcements of popular DTDs becoming available in XML format.
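
A parser can report the declarations it reads from a DTD, which is how applications get their `advance notice' of names and structures. The sketch below uses the expat binding in Python's standard library (expat is non-validating, so it reads the declarations without enforcing them); the callback name on_element_decl is an assumption, and the declarations are placed in an internal subset so the example is self-contained.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE List [
  <!ELEMENT List (Item)+>
  <!ELEMENT Item (#PCDATA)>
]>
<List><Item>Chocolate</Item><Item>Music</Item><Item>Surfing</Item></List>"""

declared = []

def on_element_decl(name, model):
    # model is expat's description of the content model; only the name is kept here.
    declared.append(name)

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)
print(declared)
```

An editor or database loader could use the same information to prepare menus, tables, or indexes before reading any of the document content.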

How does XML handle white-space in my documents?

The SGML rules regarding white-space have been changed for XML. All white-space, including linebreaks, TAB characters, and regular spaces, even between those elements where no text can ever appear, is passed by the parser unchanged to the application (browser, formatter, viewer, converter, etc), identifying the context in which the white-space was found (element content, data content, or mixed content). This means it is the application's responsibility to decide what to do with such space, not the parser's:

insignificant white-space between structural elements (space which occurs where only element content is allowed, ie between other elements, where text data never occurs) will get passed to the application. In SGML this white-space gets suppressed, which is why you can put all that extra space in HTML documents and not worry about it. This is not so in XML;

significant white-space (space which occurs within elements which can contain text and markup mixed together, usually mixed content or PCDATA) will still get passed to the application exactly as under SGML. It is the application's responsibility to handle it correctly.

My title for Section 1.

text

The parser must inform the application that white-space has occurred in element content, if it can detect it. (Users of SGML will recognize that this information is not in the ESIS, but it is in the grove.) In the above example, the application will receive all the pretty-printing linebreaks, TABs, and spaces between the elements as well as those embedded in the chapter title. It is the function of the application, not the parser, to decide which type of white-space to discard and which to retain.
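
The behaviour can be observed directly with a DOM parser. In this sketch the element names chapter, title, and p are assumptions wrapped around the example text; the point is only that the pretty-printing white-space arrives at the application as text nodes.

```python
from xml.dom import minidom

# Hypothetical markup for the example text; the element names are assumptions.
DOC = """<chapter>
  <title>My title for
    Section 1.</title>
  <p>text</p>
</chapter>"""

chapter = minidom.parseString(DOC).documentElement
for node in chapter.childNodes:
    if node.nodeType == node.TEXT_NODE:
        print("white-space node:", repr(node.data))
    else:
        print("element:", node.tagName)

# The linebreak and indentation inside the title are preserved as well.
title_text = chapter.getElementsByTagName("title")[0].firstChild.data
print(repr(title_text))
```

It is then up to the application to decide whether to drop the white-space-only nodes and whether to normalise the space inside the title.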

If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?

Yes, provided you use up-to-date SGML software which knows about the WebSGML Adaptations to ISO 8879 (the features needed to support XML, such as the variant form for EMPTY elements; some aspects of the SGML Declaration such as NAMECASE GENERAL NO; multiple attribute token list declarations, etc). An alternative is to use an SGML DTD to let you create a fully-normalised SGML file, but one which does not use empty elements; and then remove the DocType Declaration so it becomes a well-formed DTDless XML file. Most SGML tools now handle XML files well, and provide an option switch between the two standards (see the pointers in the question on software).

Does XML let me make up my own tags?

No, it lets you make up names for your own elements. If you think tags and elements are the same thing you are already in trouble: read the rest of this question carefully.

Bob DuCharme notes: Don't confuse the term `tag' with the term `element'. They are not interchangeable. An element usually contains two different kinds of tag: a start-tag and an end-tag, with text or more markup between them. XML lets you decide which elements you want in your document and then indicate your element boundaries using the appropriate start- and end-tags for those elements. Each <color>red</color> is a complete instance of the color element. <color> is only the start-tag of the element, showing where it begins; it is not the element itself. Empty elements are a special case that may be represented either as a pair of start- and end-tags with nothing between them or as a single empty-element tag that has a closing slash to tell the parser `don't go looking for an end-tag to match this'. [Bob DuCharme]
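
The two ways of writing an empty element can be compared with a parser. In this sketch the wrapper element palette is a hypothetical name; color is the element from the explanation above.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<palette><color>red</color><color></color><color/></palette>")
full, empty_pair, empty_tag = list(doc)

print(full.tag, full.text)                 # one element, delimited by two tags
print(empty_pair.text, empty_tag.text)     # both empty forms carry no content
print(ET.tostring(empty_pair) == ET.tostring(empty_tag))
```

To the application the two empty forms are the same element; the difference exists only in the tags used to write it down.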

Can XML use non-Latin characters?

Yes, the XML Specification explicitly says XML uses ISO 10646, the international standard 31-bit character repertoire which covers most human (and some non-human) languages. This is currently congruent with Unicode and is planned to be a superset of Unicode.

The spec says (2.2): `All XML processors must accept the UTF-8 and UTF-16 encodings of ISO 10646...'. UTF-8 is an encoding of Unicode into 8-bit units: the first 128 characters are the same as ASCII, and the rest of the repertoire is encoded as multi-byte sequences (between 2 and 6 bytes as originally defined; RFC 3629 has since restricted UTF-8 to at most 4 bytes). UTF-8 in its single-octet form is therefore the same as ISO 646 IRV (ASCII), so you can continue to use ASCII for English or other unaccented languages using the Latin alphabet. Note that UTF-8 is incompatible with ISO 8859-1 (ISO Latin-1) after code point 127 decimal (the end of ASCII). UTF-16 is an encoding of Unicode into 16-bit units, which uses surrogate pairs to represent characters in the 16 planes beyond the first. UTF-16 is incompatible with ASCII because it uses at least two 8-bit bytes per character.

`...the mechanisms for signalling which of the two are in use, and for bringing other encodings into play, are [...] in the discussion of character encodings.' The XML Specification explains how to specify in your XML file which coded character set you are using. Use of UCS-4 can only legally be specified in SGML or XML when the WebSGML Adaptations to ISO 8879 are implemented: this enables numbers longer than eight digits to be used in the SGML Declaration.

`Regardless of the specific encoding used, any character in the ISO 10646 character set may be referred to by the decimal or hexadecimal equivalent of its bit string': so no matter which character set you personally use, you can still refer to specific individual characters from elsewhere in the encoded repertoire by using &#dddd; (decimal character code) or &#xHHHH; (hexadecimal character code: the x must be lowercase, but the hex digits may be in either case).

The terminology can get confusing, as can the numbers: see the ISO 10646 Concept Dictionary. Rick Jelliffe has XML-ized the ISO character entity sets. Mike Brown's encoding information at http://skew.org/xml/tutorial/ is a very useful explanation of the need for correct encoding. There is an excellent online database of glyphs and characters in many encodings from the Estonian Language Institute server at http://www.eki.ee/letter/.
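
The encoding differences and the numeric character references can be checked in a few lines. This is a minimal sketch in Python; the sample characters and the word element are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

ascii_in_utf8 = "A".encode("utf-8")          # a single byte, identical to ASCII
accent_in_utf8 = "é".encode("utf-8")         # a two-byte sequence in UTF-8
accent_in_latin1 = "é".encode("iso-8859-1")  # a single byte in ISO Latin-1
ascii_in_utf16 = "A".encode("utf-16-be")     # two bytes, so not ASCII-compatible

print(ascii_in_utf8, accent_in_utf8, accent_in_latin1, ascii_in_utf16)

# Decimal and hexadecimal character references all name the same character.
text = ET.fromstring("<word>caf&#233; caf&#xE9; caf&#xe9;</word>").text
print(text)
```

The accented letter has different byte values in UTF-8 and ISO Latin-1, which is why a file must declare its encoding if it is anything other than UTF-8 or UTF-16.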

How do I control appearance?

In HTML, default styling is built into the browsers because the tagset of HTML is predefined and hardwired into browsers. In XML, where you can define your own tagset, browsers cannot know what names you are going to use and what they will mean, so you need a stylesheet if you want to display the formatted text.

Browsers which read XML will accept and use a CSS stylesheet at a minimum, but you can also use the more powerful XSLT stylesheet language to transform your XML into HTML -- which browsers, of course, already know how to display (and that HTML can still use a CSS stylesheet). As with any system where files can be viewed at random by arbitrary users, the author cannot know what resources (such as fonts) are on the user's system, so the same care is needed as with HTML using fonts.

To invoke a stylesheet from an XML file, include an xml-stylesheet processing instruction at the top of the file, giving the URL of the stylesheet in href and its type (text/css or text/xsl) in type.

The Cascading Stylesheet Specification (CSS) provides a simple syntax for assigning styles to elements, and has been implemented in most browsers.

The Extensible Stylesheet Language (XSL) has been created for use specifically with XML. Dave Pawson maintains a comprehensive FAQ at http://www.dpawson.co.uk/xsl/xslfaq.html. XSL uses XML syntax (an XSL stylesheet is an XML file) and has widespread support from several major vendors (see the questions on browsers and other software) although current browser support is limited. XSL comes in two flavours:

XSL itself, which is a pure formatting language, and which needs a text formatter like FOP or PassiveTeX to create printable output (both can produce PDF). Currently I am not aware of any Web browsers which support XSL rendering;

XSLT (T for Transformation), which is a language to specify transformations of XML into HTML either inside the browser or at the server before transmission. It can also specify transformations from one vocabulary of XML to another, and from XML to plaintext.

Currently only MS Internet Explorer 5.5 handles XSLT inside the browser (and even that needs some post-installation surgery to remove the obsolete WD-xsl and replace it with the current XSL-Transform processor). But there is a growing use of server-side processors like Cocoon, which let you store your information in XML but serve it auto-converted to HTML, thus allowing the output to be used by any browser. XSLT is also widely used to transform XML into non-SGML formats for input to other systems (for example to transform XML into LaTeX for typesetting).
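
As an illustration only, the following Python sketch shows what a stylesheet declaration looks like at the top of a file and how a program can read it back. The document, the stylesheet name poem.css, and the function name stylesheet_declarations are all assumptions made up for the example; only the xml-stylesheet processing instruction itself and the standard-library minidom parser are real.

```python
from xml.dom import minidom

# Hypothetical document: the stylesheet name "poem.css" is an assumption.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet href="poem.css" type="text/css"?>
<poem><line>The quality of mercy is not strain'd,</line></poem>"""


def stylesheet_declarations(xml_text):
    """Return the data part of each xml-stylesheet processing instruction
    that appears at the top level of the document."""
    doc = minidom.parseString(xml_text)
    return [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


print(stylesheet_declarations(DOC))
```

A browser does the equivalent lookup before rendering; swapping type="text/css" for type="text/xsl" (and pointing href at an XSLT file) would request a transformation instead.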

How do I upload or download XML to/from a database?

Ask your database manufacturer: they all provide XML import and export modules. In some trivial cases there will be a 1:1 match between field and element types; in most cases some programming is required to establish the matches, but this can usually be stored as a procedure so that subsequent uses are simply commands or calls with the relevant parameters.

Users from a database or computer science background should be aware that XML is not a database management system: it is a text markup system. While there are many similarities, some of the concepts of one are simply non-existent in the other: XML does not possess some database-like features in the same way that databases do not possess markup-like ones. It is a common error to believe that XML is a DBMS like Oracle or Access and therefore possesses the same facilities. It doesn't. [PF]
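
The trivial 1:1 case, where each field becomes one element type, can be sketched in a few lines. This is a minimal Python example using the standard sqlite3 and ElementTree modules, not any vendor's export module; the table part, its columns, and the function name table_to_xml are assumptions invented for the illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, columns):
    """Export a table as <table><row><column>value</column>...</row></table>.

    Table and column names are interpolated into the query, so they must
    come from trusted code, never from user input.
    """
    root = ET.Element(table)
    query = "SELECT {} FROM {} ORDER BY {}".format(
        ", ".join(columns), table, columns[0]
    )
    for values in conn.execute(query):
        row = ET.SubElement(root, "row")
        for name, value in zip(columns, values):
            # 1:1 match: one element type per database field.
            ET.SubElement(row, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical data, held in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO part VALUES (?, ?)", [(1, "bolt"), (2, "nut")])
exported = table_to_xml(conn, "part", ["id", "name"])
print(exported)
```

Once a mapping like this is wrapped in a function, later exports are just calls with the relevant parameters, which is the stored-procedure idea described above.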

How will XML affect my document links?

The linking abilities of XML systems are much more powerful than those of HTML, so you'll be able to do much more with them. Existing HREF-style links will remain usable, but the new linking technology is based on the lessons learned in the development of other standards involving hypertext, such as TEI and HyTime, which let you manage bidirectional and multi-way links, as well as links to a span of text (within your own or other documents) rather than to a single point. These features have been available to SGML users for many years, so there is considerable experience and expertise available in using them.

The XML Linking Specification (XLink) and XML Extended Pointer Specification (XPointer) documents contain a detailed draft specification. An XML link can be either a URL or a TEI-style Extended Pointer (XPointer), or both. A URL on its own is assumed to be a resource; if an XPointer or XLink follows it, it is assumed to be a sub-resource of that URL; an XPointer on its own is assumed to apply to the current document (all exactly as with HTML).

An XLink is always preceded by one of #, ?, or |. The # and ? mean the same as in HTML applications; the | means the sub-resource can be found by applying the link to the resource, but the method of doing this is left to the application. An XPointer can only follow a #.

The TEI Extended Pointer Notation (EPN) is much more powerful than the fragment address on the end of some URLs, as it allows you to specify the location of a link end using the structure of the document as well as known, fixed points like IDs. For example, the linked second occurrence of the word `XPointer' two paragraphs back could be referred to as http://www.ucc.ie/xml/faq.sgml#ID(hypertext).child(2,*).child(2,#element,'p').child(3,#element,'link'), meaning the third link element within the second paragraph within the second object in the element whose ID is hypertext (this question).

Count the objects from the start of this question in the XML source (which has the ID hypertext): the first child object is the title of the question; the second child object is the answer element; within the answer element go to the second paragraph; count to the third link.

David Megginson has produced an xpointer function for Emacs/psgml which will deduce an XPointer for any location in an XML document.
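
The counting procedure can be mimicked in a few lines of Python. This is only a rough sketch of the idea (find an ID, then walk down by counting children), not an implementation of the XPointer draft; the element names faq, question, title, answer, p and link, and the function follow_child_steps, are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET


def follow_child_steps(root, start_id, steps):
    """Start at the element whose id attribute equals start_id, then apply
    each (n, name) step in turn: move to the n-th child (counting from 1)
    whose tag is name, where "*" matches any element."""
    node = next(e for e in root.iter() if e.get("id") == start_id)
    for n, name in steps:
        matches = [child for child in node if name == "*" or child.tag == name]
        node = matches[n - 1]
    return node


# Hypothetical source for one FAQ question.
DOC = """<faq>
  <question id="hypertext">
    <title>How will XML affect my document links?</title>
    <answer>
      <p>First paragraph.</p>
      <p>Second <link>one</link> <link>two</link> <link>three</link></p>
    </answer>
  </question>
</faq>"""

root = ET.fromstring(DOC)
target = follow_child_steps(root, "hypertext", [(2, "*"), (2, "p"), (3, "link")])
print(target.text)
```

The three steps correspond to child(2,*), child(2,#element,'p') and child(3,#element,'link') in the pointer quoted above: second object, second paragraph, third link.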

Is there an XML version of HTML?

The W3C has released XHTML as `a reformulation of HTML 4 in XML 1.0'. This specification defines HTML as an XML application, and provides three DTDs corresponding to the ones defined by HTML 4.0. The semantics of the elements and their attributes are as defined in the W3C Recommendation for HTML 4.0. These semantics provide the foundation for future extensibility of XHTML. Compatibility with existing HTML user agents is possible by following a small set of guidelines.

I've already got SGML DTDs: how do I convert them?

There are numerous projects to convert common or popular SGML DTDs to XML format (for example the TEI DTD, both Lite and full versions). The following checklist comes courtesy of Seán McGrath (author of XML By Example, Prentice Hall, 1998):

No equivalent of the SGML Declaration, so keywords, character set, etc are essentially fixed;
Tag minimization is not allowed, so the omitted-tag minimization indicators must be removed from element declarations;
#PCDATA must only occur at the extreme left (ie first) in an OR model, so a model that lists it after other names has to be reordered, and a sequence (comma) model containing #PCDATA is illegal;
No CDATA, RCDATA elements [declared content];
Some SGML attribute types are not allowed in XML, eg NUTOKEN;
Some SGML attribute defaults are not allowed in XML, eg CONREF;
Comments cannot be inline to declarations;
A whole bunch of SGML optional features are not present in XML: all forms of tag minimization (OMITTAG, DATATAG, SHORTREF, etc); Link Process Definitions; Multiple DTDs per document; and many more: see http://www.w3.org/TR/NOTE-sgml-xml-971215 for the list of bits of SGML that were removed for XML;
And [nearly] last but not least, CONCUR!

There are some important differences between the internal and external subset portion of a DTD in XML:

Marked Sections can only occur in the external subset; and
Parameter Entities must be used to replace entire declarations in the internal subset portion of a DTD: a parameter entity reference placed inside a declaration there (for instance, standing in for a content model) is invalid XML.
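
One item on the checklist, removing tag minimization from element declarations, is mechanical enough to sketch. The Python fragment below is a rough illustration under stated assumptions, not a DTD converter: it handles only declarations that name a single element type, and the sample declarations and the function name strip_tag_minimization are invented for the example.

```python
import re

# The two omitted-tag minimization indicators ("-" or "O") that follow the
# element name in an SGML element declaration. Declarations that use a
# parenthesised name group are deliberately not handled by this sketch.
_MINIMIZATION = re.compile(r"(<!ELEMENT\s+\S+)\s+[-Oo]\s+[-Oo](?=\s)")


def strip_tag_minimization(declaration):
    """Return the declaration with the minimization indicators removed."""
    return _MINIMIZATION.sub(r"\1", declaration)


# Hypothetical SGML declarations.
print(strip_tag_minimization("<!ELEMENT x - O (A,B)>"))
print(strip_tag_minimization("<!ELEMENT y - O EMPTY>"))
```

Everything else on the checklist (reordering #PCDATA, replacing NUTOKEN, moving comments out of declarations) needs a human decision, so a script like this is at best a first pass.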

Where's the spec?

Developers and Implementors (including WebMasters and server operators): right here (http://www.w3.org/TR/REC-xml). Includes the EBNF. There are also versions in Japanese (http://www.fxis.co.jp/DMS/sgml/xml/); Spanish (http://www.ucc.ie/xml/faq-es.html); Korean (http://xml.t2000.co.kr/faq/index.html) and a Java-ised annotated version at http://www.xml.com/axml/testaxml.htm. Eve Maler maintains the DTD used for the spec itself; the DTD is also used to encode several other W3C specifications, such as XLink, XPointer, DOM, XML Schema, etc. There is documentation available for the DTD. Note that the XML spec needs to use a special one-off version of the DTD, since the real original DTD used for it has long since been lost.

Can I use Java, ActiveX, etc in XML files?

This will depend on what facilities the browser makers implement. XML is about describing information; scripting languages and languages for embedded functionality are software which enables the information to be manipulated at the user's end, so these languages do not belong in an XML file but in stylesheets like XSL and CSS. XML itself provides a way to define the markup needed to implement scripting languages: as a neutral standard it neither encourages nor discourages their use, and does not favour one language over another, so the field is wide open.

What's a namespace?

Randall Fowle writes: A namespace is a collection of element and attribute names identified by a Uniform Resource Identifier reference. The reference appears in the root element as a value of the xmlns attribute: for an XML document with a root element x, the namespace reference is the value of an xmlns attribute on x. More than one namespace may appear in a single XML document, to allow a name to be used more than once. Each reference can declare a prefix to be used by each name, so the previous example might instead declare the attribute as xmlns:spc, which would nominate the namespace for the `spc' prefix, so that an element in that namespace (such as one containing `Mr. Big') is written with the spc: prefix on its name. The reference does not need to be a physical file; it is simply a way to distinguish between namespaces. The reference should tell a person looking at the XML document where to find definitions of the element and attribute names using that particular namespace.
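
A short Python sketch may make the prefix mechanism concrete. The namespace URI http://example.org/ns/company and the element name spc:name are assumptions invented for the illustration; only the root element x, the `spc' prefix, and the content `Mr. Big' come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace reference; it need not point at a real file.
NS = "http://example.org/ns/company"

DOC = f"""<x xmlns:spc="{NS}">
  <spc:name>Mr. Big</spc:name>
  <name>unqualified</name>
</x>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to "{namespace-URI}localname", which shows
# that it is the URI, not the prefix, that identifies the namespace.
qualified = root.find(f"{{{NS}}}name")
print(qualified.tag)
print(qualified.text)

# The same local name outside the namespace is a different name entirely.
print(root.find("name").text)
```

Both elements are called name, but a namespace-aware processor keeps them apart, which is what allows a name to be used more than once in one document.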

I keep hearing about alternatives to DTDs. What's a schema?

A DTD is for specifying the structure (only) of an XML file: it gives the names of the elements, attributes, and entities that can be used, and how they fit together. Because DTDs were designed for use with traditional text documents, they have no mechanism for defining the content of elements in terms of data types, because XML has no data types: text is just text. A DTD therefore cannot be used to specify numeric ranges or to define limitations or checks on the text content, only on the markup that surrounds it.

The XML Schema recommendation provides a means of specifying element content in terms of data types, so that document type designers can provide criteria for validating the content of elements as well as the markup itself. Schemas are written as XML files, thus avoiding the need for processing software to be able to read XML Declaration Syntax, which is different from XML Instance Syntax. Schemas are now a formal Recommendation, and a number of sites are serving useful applications as both DTDs and Schemas, eg http://www.schema.net and http://www.dtd.com. There is a separate Schema FAQ at http://www.schemavalid.com.

The term `vocabulary' is sometimes used to refer to `DTDs and Schemas' together. Authors and publishers should note that the plural of Schema is Schemas: the use of the singular to do duty for the plural is a foible dear to the semi-literate; the use of the old (Greek) plural schemata is now unnecessary didacticism. Writers should also note that the plural of DTD is DTDs: there is no apostrophe.

Bob DuCharme adds: Many XML developers were dissatisfied with the syntax of the markup declarations described in the XML spec for two reasons. First, they felt that if XML documents were so good at describing structured information, then the description of a document type's own structure (its schema) should be in an XML document instead of written with its own special syntax. In addition to being more consistent, this would make it easier to edit and manipulate the schema with regular document manipulation tools. Secondly, they felt that traditional DTD notation didn't allow document type designers the power to impose enough constraints on the data -- for example, the ability to say that a certain element type must always have a positive integer value, that it may not be empty, or that it must be one of a list of possible choices. Being able to state such constraints in the schema eases the development of software using that data, because the developer has less error-checking code to write.
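
To see what `less error-checking code' means in practice, here is the kind of hand-written check a developer needs when only a DTD is available. It is a Python sketch under stated assumptions: the order document, its quantity and status elements, the list of allowed values, and the function check_order are all hypothetical, and no schema processor is involved.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical list of permitted values for the status element.
ALLOWED_STATUS = {"open", "shipped", "closed"}


def check_order(xml_text):
    """Return a list of content problems that a DTD could not have caught."""
    order = ET.fromstring(xml_text)
    problems = []

    # "Must always have a positive integer value."
    quantity = (order.findtext("quantity") or "").strip()
    if not re.fullmatch(r"[0-9]+", quantity) or int(quantity) <= 0:
        problems.append("quantity must be a positive integer")

    # "Must be one of a list of possible choices" (which also rules out empty).
    status = (order.findtext("status") or "").strip()
    if status not in ALLOWED_STATUS:
        problems.append("status must be one of " + ", ".join(sorted(ALLOWED_STATUS)))

    return problems


print(check_order("<order><quantity>3</quantity><status>open</status></order>"))
print(check_order("<order><quantity>-3</quantity><status>lost</status></order>"))
```

With a schema that declares quantity as a positive integer and status as an enumeration, a validating processor reports both problems and this function is no longer needed.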

What XML software can I use today?

Details are no longer listed in this FAQ as they are now changing too rapidly to be kept up to date: see the XML Web pages at http://xml.coverpages.org/ and watch for announcements on the mailing lists and newsgroups. For a detailed guide to some examples of XML programs and the concepts behind them, see the editor's book Understanding SGML and XML Tools (Kluwer, 1998, 0-7923-8169-6).

An important distinction is evident between the two major classes of XML application, `document' and `data', and this is reflected especially in editing and development software. Document-style applications are in the nature of traditional publishers' work: text and images in a structured environment, with fonts and formatting. Data-style applications are found in e-commerce, with XML being used as a container for information being passed between systems, usually unformatted and unseen by humans. While in theory it would be possible to use a data-class editor to write a novel, or a document-class editor to create invoices, specialist users in both classes should remain aware that the other applications do exist.

For browsers see the question on XML Browsers and the details of the xml-dev mailing list for software developers. Bert Bos keeps a list of some XML developments in bison, flex, perl and Python. Information for developers of Chinese XML systems can be found at the Chinese XML Now! website of Academia Sinica: http://www.ascc.net/xml/. This site includes an FAQ and test files.

What's the story on XML and EDI?

Electronic Data Interchange has been used in e-commerce for many years to exchange documents between commercial partners to a transaction. It has required special proprietary software, but there are now moves to enable EDI documents to travel inside XML. Details of developments are at http://www.xmledi.com/ and there is a guideline document at http://www.geocities.com/WallStreet/Floor/5815/guide.htm.

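Co-Stars of the Network: XML and Java Technology

XML and Java technology complement each other perfectly, creating a new world of possibilities for developers
Jon Byous
Chinese translation: Frank Gu (guxf@bigfoot.com)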

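XML, the Extensible Markup Language, has been touted as the biggest news in Internet applications since Java technology burst onto the scene.
It is hard to imagine two more complementary technologies: the Java platform provides the foundation for distributing code safely and conveniently over the network, and XML technology offers the same capability for data, as a clear, platform-independent way of representing content.
The World Wide Web Consortium (W3C) released the XML 1.0 standard on February 10, 1998. Since then, XML technology has rapidly gained support as a common data interchange format for networked systems. The practical benefits (of using XML) are: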

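Structure -- build data models of any hierarchical complexity.
Extensibility -- define new tags as needed.
Validation -- check data for structural correctness.
Media independence -- publish content in multiple formats.
Vendor and platform independence -- process any conforming (XML) document with standard commercial software or even plain text tools.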
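Java Technology Standard Extension for XML
Sun supports XML technology through the Java platform and is leading the effort to define a Java technology standard extension for XML. It will be developed with industry participants through the Java Community Process, to ensure stability and compatibility. Enterprises can rely on the XML standard extension for high-quality integration with the Java platform.
The first step is to provide basic functionality through the XML standard extension, including reading, manipulating, and generating XML text. These core functions will form the building blocks for developing full-featured applications based on XML technology.
The XML standard extension will consist of a specification, a reference implementation, and a compatibility test suite. In keeping with Sun's commitment to open processes and industry standards, the XML standard extension will conform to the XML 1.0 specification and will make full use of the Java APIs already developed for XML technology, including the W3C DOM Level 1 Core Recommendation and the SAX 1.0 API.
According to Anne Thomas, senior consultant at the Patricia Seybold Group in Boston, the standard extension is a big step forward: "The Java platform standard extension for XML will provide standard classes for generating and processing XML and, because it is a standard extension, these classes will be available on almost every Java platform. Developers will no longer need to develop these classes themselves, and XML documents will not become bloated, because we will not need to include the classes in the application code. The classes will reside on the target system."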
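Enterprise Platform Support
XML technology will also be used in several key areas of Sun's Java enterprise platform. Bill Roth, product line manager for the Java 2 Platform, Enterprise Edition, says: "XML is the foundation of our next-generation enterprise computing platform, the Java 2 Platform, Enterprise Edition. We will use it to make Enterprise JavaBeans components easier to use. We will also make it the standard for carrying mission-critical enterprise data."
Sun has announced that it is adding a standard extension based on XML technology to the next release of the Enterprise JavaBeans architecture, in response to customer requests to make EJB components more widely usable. (Translator's note: this refers to EJB 2.0, which has since been released.)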
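The Perfect Combination: XML and Java Technology
XML technology is expected to have a revolutionary effect on network-oriented applications, particularly in data interchange. Together, Java and XML make possible a new generation of Web applications in areas such as e-commerce and enterprise application integration.
Almost all the major players in Internet technology have now committed to supporting XML technology. Besides Sun, companies such as IBM, Oracle, Fujitsu, Novell, Webmethods, Ariba, Bluestone, CommerceOne, Vervet, and NetPost are developing products and technologies that use XML and Java together.
At Sun, perhaps the greatest champion of the new technology is Jon Bosak, who also chairs the W3C XML Coordination Group and is generally regarded as the father of XML. Bosak says: "XML and Java are the yin and yang of vendor-independent programming. Put them together and you get a complete, platform-independent, Web-based computing environment."
"Smart Data"
Anne Thomas of the Patricia Seybold Group explains: "Combining Java and XML technology yields portable `smart' data. XML provides a universally applicable format for structured data, while Java technology provides universally applicable code. Because code written in the Java language can be embedded in documents written in XML, we can create data structures that carry their own data-processing programs. It is a great combination."
The Java platform really is the technology of choice for developers working with XML. Many parsers and general-purpose tools, for example, have been developed on the Java platform. Developers are drawn not only to the portability and attractive object-oriented features of the Java language but also to its efficiency. JP Morgenthal, president of NC.Focus, an enterprise application integration analysis and consulting firm, notes: "Writing their tools in the Java language lets companies and developers get the job done faster. Java's string handling, its support for hash tables and URLs, and a number of other features also make it a natural tool for developing applications like XML. Finally, sharing code is genuinely easy, which is a very important property in a field moving this fast."
It is a two-way street. With its flexible metadata and portable data, XML gives Java a tremendous boost by making data easier to move across the network. Java technology gives developers solid productivity gains over C and C++. Together, XML and Java technology mean that platform-independent, standards-based applications can be developed right away.
Wherever information must be exchanged across networked systems, as in electronic data interchange (EDI), e-commerce, enterprise resource planning, and workflow applications, XML and Java technology together are an ideal choice.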
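The Portable Purchase Order
Many observers believe that XML and Java technology together will revolutionize the way we exchange and process information: as information arrives, we will be able to use applications built on Java technology to process it according to our own needs. Sun's Bill Smith explains: "XML technology makes information exchange possible, and Java technology makes automated processing more flexible." Smith is a designer in the W3C XML Linking Working Group.
For example, a company purchase order described in XML can contain live components, such as part and customer numbers, that can be tied to databases to update inventory and shipping records automatically in different programs, without the data being entered again.
In this example a single order can mean different things in different applications. Someone in the purchasing department might have the right to assign the order number, specify the customer code, and modify the amount, whereas the supplier could only confirm the order and modify the amount, and the recipient could only view, store, or print the document. In each case, however, it is essentially the same document, based on the same data, with different behavior specified for different recipients.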
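
The purchase order scenario can be sketched briefly. The example below is in Python rather than Java, purely for compactness, and everything in it is an assumption made up for the illustration: the purchaseOrder and item element names, the part and customer numbers, and the function apply_order.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order with part and customer numbers as attributes.
ORDER = """<purchaseOrder customer="C042">
  <item part="P-100" quantity="3"/>
  <item part="P-200" quantity="1"/>
</purchaseOrder>"""


def apply_order(order_xml, inventory):
    """Subtract each ordered quantity from the stock held for that part
    and return the customer number, so nothing has to be keyed in again."""
    order = ET.fromstring(order_xml)
    for item in order.iter("item"):
        inventory[item.get("part")] -= int(item.get("quantity"))
    return order.get("customer")


stock = {"P-100": 10, "P-200": 5}
customer = apply_order(ORDER, stock)
print(customer, stock)
```

A shipping program could read the same document and use only the customer number, which is the sense in which one order serves several applications.
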
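Alternatively, the behavior of the same data can change according to the application that processes it, or even the device the application runs on. This means, for example, that a simple stock-market data feed can drive different applications: a scrolling text window, a customized chart, or a Web page mixing text and graphics.
In document management and publishing applications, XML and Java technology can provide breakthroughs such as media-independent publishing, device-independent presentation, and client-side processing of customized data and views.
This is because, unlike HTML documents, which depend on CGI scripts on the Web server to provide functionality, XML and Java technology can deliver more application functionality directly to the client device for processing. This gives users more control over the data on the client side while reducing network processing and traffic.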

See Also
XML Technology Pages on java.sun.com
(http://java.sun.com/xml/)
Java Community Process pages on the Java Developer Connection
(http://java.sun.com/jdc/jcp/index.html)
XML and Java Technologies in the News - Search results from our Java Industry Connection site.
(http://java.sun.com/industry/)
The SGML/XML Web Page by Robin Cover
(http://www.oasis-open.org/cover/sgml-xml.html)
Java Project X Technology Release 1 - code for XML technology services.
(http://java.sun.com/features/1999/03/xml-side1.html)
Managing Names and Ontologies: An XML Registry and Repository by Robin Cover
(http://www.sun.com/981201/xml/)
WDVL.com: The Web Developer's Virtual Library - XML Subsite
(http://www.wdvl.com/Authoring/Languages/XML/)
Tutorials for using the Java 2® platform and XML technology
(http://developerlife.com)
General XML info: Published by Seybold
(http://www.xml.com)
XML FAQ
(http://www.ucc.ie/xml/)

Can I do mathematics using XML?

Yes, if the document type you use provides for math. The mathematics-using community is developing software, and there is a MathML Recommendation at the W3C, which is a native XML application. It would also be possible to make XML fragments from other DTDs, such as the long-expired HTML3, the near-obsolete HTML Pro, or ISO 12083 Math, or OpenMath, or one of your own making. Browsers which display some math embedded in SGML already exist (eg DynaText, Panorama, Multidoc Pro).

How does XML handle metadata?

Because XML lets you define your own markup language, you can make full use of the extended hypertext features (see the question on Links) of XML to store or link to metadata in any format (eg ISO 11179, Dublin Core, Warwick Framework, Resource Description Framework (RDF), and Platform for Internet Content Selection (PICS)). There are no predefined elements in XML, because it is an architecture, not an application, so it is not part of XML's job to specify how or if authors should or should not implement metadata. You are therefore free to use any suitable method from simple attributes to the embedding of entire Dublin Core/Warwick Framework metadata records. Browser makers may also have their own architectural recommendations or methods to propose.

Can I use Java to create or manage XML files?

Yes, any programming language can be used to output data from any source in XML format. There is a growing number of front-ends and back-ends for programming environments and data management environments to automate this. There is a large body of `middleware' written in Java and other languages for managing data either in XML or with XML output. There is a suite of Java tutorials (with source code and explanation) available at http://developerlife.com. Please do not mail the FAQ editor with questions about your Java programming bugs. Ask one of the Java newsgroups instead.
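
Since any language will do, here is a minimal sketch in Python (chosen only for brevity; the same approach applies in Java). The records, the companies and company element names, and the function records_to_xml are assumptions invented for the example. The point it illustrates is that building the output through an XML library, rather than by pasting strings together, keeps the result well-formed when the data contains characters like & and <.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Write a list of records as one XML document, letting the library
    escape any markup characters that occur in the data."""
    root = ET.Element("companies")
    for record in records:
        company = ET.SubElement(root, "company", id=str(record["id"]))
        company.text = record["name"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical data containing characters that are special in XML.
xml_text = records_to_xml([{"id": 1, "name": "AT&T"}, {"id": 2, "name": "a < b"}])
print(xml_text)

# Reading the output back recovers the original strings.
print([c.text for c in ET.fromstring(xml_text)])
```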

How do I execute or run an XML file?

You can't and you don't. XML is not a programming language, so XML files don't `run' or `execute'. XML is a markup specification language and XML files are data: they just sit there until you run a program which displays them (like a browser) or does some work with them (like a converter which writes the data in another format, or a database which reads the data), or modifies them (like an editor).

Can I still use server-side inclusions?

Yes, so long as what they generate ends up as part of an XML-conformant file (ie either valid or just well-formed). Server-side tag-replacers like shtml, PHP, JSP, ASP, Zope, etc store almost-valid files using comments, Processing Instructions, or non-XML markup, which gets replaced at the point of service by text or XML markup. It is unclear why some of these systems continue to use non-XML markup. There are also some XML-based preprocessors for formats like XVRL (eXtensible Value Resolution Language) which resolve specialised references to external data and output a normalised XML file.

Is there a conformance test suite for XML processors?

James Clark has a collection of test cases for testing XML parsers at http://www.jclark.com/xml/ which includes a conformance test.

Mary Brady, OASIS XML Conformance TC Chair, writes: A much larger and more comprehensive suite is the NIST/OASIS Conformance Test Suite, available from http://www.oasis-open.org/committees/xmltest/testsuite.htm, which contains contributions from James Clark, OASIS and NIST, Sun, and Fuji Xerox.

Carmelo Montanez writes: NIST has developed a number of XSLT/XPath tests, which will be part of the official OASIS XSLT/XPath suite (not yet released). These tests are available from our web site at http://xw2k.sdct.itl.nist.gov/xml/index.html (click on `XSL Testing'). The expected output may be slightly different from one implementation to another. The OASIS XSLT technical committee has a solution for that problem; however, our tests do not yet implement such a solution. Please forward any comments to carmelo@nist.gov.

Jon Noring writes: For those who are interested, I took the current and complete Unicode 3.0 `cast' of characters and their hex codes, and created a simple XML document of it to test XML browsers for Unicode conformity. It is not finished yet -- I need to add comments and to fix the display of rtl characters (ie Hebrew, Arabic). It is found at: http://www.windspun.com/unicode-test/unicode.xml. It is quite large, almost 900K in size, so be prepared. IE5 renders many of the characters in this XML document -- and for the ones it does render it appears to do so correctly. I look forward to when Opera will do likewise. I haven't tested the current version of Mozilla/Netscape for Unicode conformity.

Which should I use in my DTD, attributes or elements?

There is no single answer to this: a lot depends on what you are designing the document type for. Traditional editorial practice is to put the real text (what would be printed) as character data content, and keep the metadata (information about the text) in attributes, from where they can more easily be isolated for analysis or special treatment like display in the margin or in a mouseover: for example, a line of verse with the speaker's name (Portia) and the words (The quality of mercy is not strain'd,) as character data, and the line number held in an attribute.

But from the system's point of view, there is nothing wrong with storing the data the other way round, especially where the volume of text data on each occasion is relatively small: the same line could carry the speaker and the words in attributes, with only the line number (184) as character data.

A lot will depend on what you want to do with the information and which bits of it are easiest accessed by each method. A rule of thumb for conventional text documents is that if the markup were all stripped away, the bare text should still be readable and usable, even if unformatted and inconvenient. For database output, however, or other machine-generated documents like e-commerce transactions, human reading may not be meaningful, so it is perfectly possible to have documents where all the data is in attributes, and the document contains no character data in content models at all. See http://xml.coverpages.org/elementsAndAttrs.html for more information.
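
The rule of thumb about stripping the markup away can be tried out directly. In this Python sketch the element and attribute names (line, speaker, text, n) are assumptions chosen for the illustration, as is the function name bare_text; only the speaker, the words, and the line number come from the example above.

```python
import xml.etree.ElementTree as ET

# Text as character data, metadata (the line number) in an attribute.
TEXT_AS_CONTENT = (
    '<line n="184"><speaker>Portia</speaker>'
    "<text>The quality of mercy is not strain'd,</text></line>"
)

# The other way round: text in attributes, only the number as content.
TEXT_AS_ATTRIBUTES = (
    '<line speaker="Portia" '
    "text=\"The quality of mercy is not strain'd,\">184</line>"
)


def bare_text(xml_text):
    """Strip all markup away and keep only the character data."""
    return "".join(ET.fromstring(xml_text).itertext())


print(bare_text(TEXT_AS_CONTENT))
print(bare_text(TEXT_AS_ATTRIBUTES))
```

The first form still yields the speaker and the line once the markup is gone; the second leaves only the number, which is acceptable for machine-generated data but not for a text meant to be read.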

How does XML fit with the DOM?

The Document Object Model (DOM) (http://www.w3.org/TR/REC-DOM-Level-1) provides an abstract API for constructing, accessing, and manipulating XML and HTML documents. A binding of the DOM to a particular programming language provides a concrete API. Microsoft and other vendors provide APIs which let you use the DOM to query and manipulate XML documents in memory. The public Working Draft for the DOM Level 3 XPath is at http://www.w3.org/TR/2001/WD-DOM-Level-3-XPath-20010618/.
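
As one example of a concrete binding, Python's standard library includes a small DOM implementation (xml.dom.minidom). The sketch below uses it to show the three activities named above: constructing, accessing, and manipulating a document in memory. The faq, question and answer element names and their content are assumptions made up for the illustration.

```python
from xml.dom import minidom

# Parse a hypothetical document into an in-memory DOM tree.
doc = minidom.parseString("<faq><question>What is XML?</question></faq>")

# Accessing: find an element and read the text node inside it.
question = doc.getElementsByTagName("question")[0]
print(question.firstChild.data)

# Constructing and manipulating: create a new element with a text node
# and attach it to the document element.
answer = doc.createElement("answer")
answer.appendChild(doc.createTextNode("A markup specification language."))
doc.documentElement.appendChild(answer)

serialized = doc.documentElement.toxml()
print(serialized)
```

The method names (getElementsByTagName, createElement, appendChild) are those of the DOM itself, so the same calls appear, with minor spelling differences, in the Java and JavaScript bindings.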