TaXML Presentation


From:  http://www.taxadmin.org/fta/meet/2000tech/andrson/
"TaXML Presentation."
By Michael Sanzi and Lesley Anderson
September 12, 2000
Also in: 
http://www.taxadmin.org/fta/meet/2000tech/andrson/anderson.ppt


[Slide #1]

TaXML Presentation
     Lesley Anderson
     
[Slide #2]
Introduction 

     "Straw man" XML-based schema 

     The schema authors


Notes: 


     Straw man 

     This schema is presented as a straw man -- it is a starting point for working with XML. 

     The people working on this were all tax professionals with many years of experience. 

     Even though this has been carefully reviewed, there may be 
     inconsistencies in the examples. 

     This proposal may not exactly meet the requirements of the agencies that would use it,
     simply because there has not been sufficient (any!) communication between
     developers and customers. 

     The Dev Team 

     Tax professionals with many years of experience 

     Worked for 10 years developing commercial tax software for professional use 

     Worked at MS for the last 2 years developing the TaxSaver product 

     XML experience provided by programmers at MS 
     
[Slide #3]
Agenda

     Developing a hierarchy 

     Creating the schema 

     Creating the XML data file 

     Validating the XML data file with the schema 

     Displaying data using XSL
     
[Slide #4]
Developing a Hierarchy

     Tax forms and electronic filing 

     Included fields that need data entered 

     Included data only once 

     Exceptions: key fields and placeholders

Notes: 


     Tax forms and electronic filing 

     Analyzed forms and flow, and reviewed the electronic filing specs 

     Looked for ways to logically group data, rather than just reiterate the forms 

     Looked for ways to reduce the amount of data by avoiding duplication 

     Used descriptive tag names 

     Included entry fields 

     Entered fields on forms are included in the hierarchy 

     Subtotals and non-significant computed fields are not included 

     Included fields that would be computed on a worksheet or a form not supported 
     for this version. These are placeholders for a group of fields that would be entered
     later. 

     Included data only once 

     If data appeared in more than one place, we included it only at its source 
     entry point. 

     For example, advance EIC is entered on the W-2 and then carried to the 1040.
     In our model, the field appears only on the W-2. 
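
     A minimal sketch of the "data only once" rule. The tag names W2List, W2, and
     AdvanceEIC are assumptions made for illustration, not names taken from the
     actual TaXML schema. The amount is stored on each W-2, and the Form 1040
     figure is derived from those entries rather than stored a second time.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical data fragment: the tag names are illustrative assumptions.
SAMPLE = """
<W2List>
  <W2><AdvanceEIC>120.00</AdvanceEIC></W2>
  <W2><AdvanceEIC>80.50</AdvanceEIC></W2>
</W2List>
"""


def advance_eic_for_1040(w2_list):
    """Derive the 1040 amount by summing the W-2 source entries."""
    amounts = (Decimal(e.text) for e in w2_list.iter("AdvanceEIC"))
    return sum(amounts, Decimal("0"))


total = advance_eic_for_1040(ET.fromstring(SAMPLE))
print(total)
```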

[Slide #5]

Forms Supported

     Form 1040 

          Schedule A 
          Schedule B 
          Schedule C 
          Schedule E (pg.1) 
          Schedule EIC 
          Schedule F 
          Schedule H 
          Schedule J 
          Schedule R 
          Schedule SE

     Form 1040A 

          Schedule 1 
          Schedule 2 
          Schedule 3 

     Form 1040EZ 

     Form 2210-F 

     Form 2441 

     Form 4255 

     Form 4562 

     Form 4797 

     Form 4835 

     Form 8606

     Form 8615 

     Form 8812 

     Form 8815 

     Form 8828 

     Form 8829 

     Form 8839 

     Form 8863 

     Form 9465 

     Form W-2 

     Form 1099-INT, DIV, & MISC 

     Form 1099-R 

Notes: 


     It is somewhat misleading to speak in terms of forms. 

     It is more accurate to say that the fields with entered data, as contrasted 
     with computed data, are included in the schema.

[Slide #6]
At the Top of the Hierarchy

     TaXML 

          Authentication 
          Identification 
          KeyID 
          TaxYear 
          Version 

               Major 
               Minor 
          IndividualTax 
          CorporateTax 
          W-2 & W-3

Notes: 


     TaXML 

     An XML document must have a single root element. In this case, it's TaXML. 

     There would, of course, need to be elements for authentication, signon, and 
     versioning. This is not addressed in this hierarchy except for a couple of 
     placeholders. As security and transmission needs are determined, this part
     of the schema will be filled in. 

     Individual Tax is the top level for both federal and state income tax for individuals. 

     I'm showing Corporate Tax as an example of the level at which another entity
     would appear in the hierarchy. This could be included as an actual part of the schema
     through a namespace or a data island. 

     Also, this type of schema could be used to transmit information from 
     employers and financial institutions to the IRS.
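
     As a rough illustration of the top of the hierarchy, the sketch below builds
     an empty TaXML document using the element names from the slide. The tax year
     and version values are assumptions, and the Authentication, Identification,
     and KeyID elements are left empty because the presentation treats them as
     placeholders.

```python
import xml.etree.ElementTree as ET


def build_taxml_skeleton(tax_year, major, minor):
    """Build the top-level elements listed on the slide."""
    root = ET.Element("TaXML")
    # Placeholders only: their content is not defined in the presentation.
    for name in ("Authentication", "Identification", "KeyID"):
        ET.SubElement(root, name)
    ET.SubElement(root, "TaxYear").text = str(tax_year)
    version = ET.SubElement(root, "Version")
    ET.SubElement(version, "Major").text = str(major)
    ET.SubElement(version, "Minor").text = str(minor)
    ET.SubElement(root, "IndividualTax")
    return root


# Sample values are assumptions chosen for the example.
skeleton = build_taxml_skeleton(1999, 1, 0)
print(ET.tostring(skeleton, encoding="unicode"))
```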
     
[Slide #7]

The Taxpayer Element

     Taxpayer 

          IDNumber 
          Name 

               FirstName 
               MiddleInitial 
               LastName 
               Suffix 
               CompleteName 
               NameControl 
          Age65OrOlder 
          Blind 
          MilitaryIndicator 
          HomePhone 
          WorkPhone 
          PresidentFund 
          Exemption

Notes: 


     The Taxpayer Element 

     ID number is the Social Security number. We have used the more general
     term IDNumber to allow this tag to be used for either SSN or EIN. 

     The taxpayer's name is made up of the six pieces shown. We gather the
     first name, middle initial, last name, and suffix separately to make name control and matching
     easier. The name is then displayed in full in the CompleteName field. 

     The NameControl represents the same content that is currently used in
     electronic filing. Whether this would still be needed in this new methodology is unclear. 

     Then we have pulled together other information that is specifically
     related to the taxpayer. 

     There are a couple of other fields that are in this section that 
     were deleted on this slide due to space constraints. 

     The spouse has an identical element.
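
     One way the CompleteName field could be filled from the separately gathered
     parts is sketched below. Joining the parts with single spaces and skipping
     empty ones is an assumption; the presentation does not say how the full name
     is assembled.

```python
import xml.etree.ElementTree as ET

NAME_PARTS = ("FirstName", "MiddleInitial", "LastName", "Suffix")


def add_complete_name(name_elem):
    """Fill CompleteName from the individual parts, skipping blanks."""
    parts = [(name_elem.findtext(tag) or "").strip() for tag in NAME_PARTS]
    complete = " ".join(p for p in parts if p)
    target = name_elem.find("CompleteName")
    if target is None:
        target = ET.SubElement(name_elem, "CompleteName")
    target.text = complete
    return complete


# Sample taxpayer name invented for the example.
name = ET.fromstring(
    "<Name><FirstName>Pat</FirstName><MiddleInitial>Q</MiddleInitial>"
    "<LastName>Example</LastName><Suffix/></Name>"
)
complete_name = add_complete_name(name)
print(complete_name)
```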
     
[Slide #8]

Address

     Address 

          Street 
          Street2 
          ApartmentSuite 
          City 
          State 
          ZipCode 
          NewAddress

Notes: 


     Address 

     The hierarchy was developed looking at federal and California. If California 
     needed a piece of information that would be relevant to federal or another state, that field
     was added to the federal portion of the hierarchy. 

     An example of this is that Street2 is a second line for the street address. 

     The NewAddress tag is a boolean (Yes/No) field to indicate whether this is a new address. 
     
[Slide #9]

FilingStatusInformation

     FilingStatusInformation 

          FilingStatus 
          MFS 

               Name 
               IDNumber 
               DidNotLiveWithSpouse 
          HeadofHousehold 

               Name 
               IDNumber 
          QWYearSpouseDied 
          MustItemizeIndicator


Notes: 


     FilingStatusInformation 

     FilingStatusInformation includes the filing status used in the return. 

     The MFS element was created to group the information needed for the spouse
     when the status is married filing separately. 

     Under the MFS element is the Name element. This is bolded to show that this
     element has been previously used. This simplifies the hierarchy by not needing
     to again list the components. 

     The Head of Household information is for the qualifying dependent. 
     
[Slide #10]

DependentList

     DependentList 

          Dependent 

               Name 
               IDNumber 
               Relationship 
               QualifyforTaxCredit 
               QualifiedCareExpense 
               YearofBirth 
               Student 
               Disabled 
               NumberOfMonths 
               PYChildCareIndicator

Notes: 


     DependentList 

     Here is the first example of dealing with an element that may occur more than 
     once, or in English, that may be present in the XML data file more than once. 

     Throughout this proposed schema, we have used the suffix "List" to indicate that 
     the element may contain more than one of the element it is named for. 

     In short, here is what this hierarchy means: 
          One DependentList is allowed. 
          Within each DependentList, there can be an unlimited number of Dependent 
          elements. 
          Within each Dependent element, there can be only one of each of the items listed. 
          When we move on to examine the schema, you will be able to see that this is 
          accomplished using the 'minOccurs' and 'maxOccurs' attributes. 
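
     The occurrence rules can be sketched with a hand-written XDR fragment and a
     small check in Python. The fragment and its minOccurs/maxOccurs values are
     assumptions, not an excerpt from the TaXML schema, and the check below is
     only an illustration of the idea, not a substitute for a validating parser.

```python
import xml.etree.ElementTree as ET

XDR_NS = "urn:schemas-microsoft-com:xml-data"

# Hand-written XDR fragment; the occurrence values are assumptions.
# The declaration of Dependent itself is omitted for brevity.
SCHEMA = f"""
<Schema xmlns="{XDR_NS}">
  <ElementType name="DependentList" content="eltOnly">
    <element type="Dependent" minOccurs="0" maxOccurs="*"/>
  </ElementType>
</Schema>
"""


def occurrence_limits(schema_root, parent_name):
    """Return {child type: (min, max)}; max is None when unbounded ('*')."""
    limits = {}
    for etype in schema_root.iter(f"{{{XDR_NS}}}ElementType"):
        if etype.get("name") != parent_name:
            continue
        for ref in etype.findall(f"{{{XDR_NS}}}element"):
            # A missing attribute is treated as 1 in this sketch.
            max_raw = ref.get("maxOccurs", "1")
            limits[ref.get("type")] = (
                int(ref.get("minOccurs", "1")),
                None if max_raw == "*" else int(max_raw),
            )
    return limits


def within_limits(parent, limits):
    """Check how many times each child type occurs under parent."""
    for child_type, (low, high) in limits.items():
        count = len(parent.findall(child_type))
        if count < low or (high is not None and count > high):
            return False
    return True


limits = occurrence_limits(ET.fromstring(SCHEMA), "DependentList")
data = ET.fromstring(
    "<DependentList><Dependent/><Dependent/><Dependent/></DependentList>"
)
print(limits, within_limits(data, limits))
```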
          
[Slide #11]

Digging Into the Hierarchy

     Wages 

          Demonstrates adding levels to the hierarchy 
          Shows how state data can be gathered 

     ActivityList 

          Combining business, rental, farm, and farm rentals 
          Depreciation 

     California 

          Integrating state into the mix

Notes: 


     Digging Into the Hierarchy 

     We're going to be using a paper handout for this part of the presentation 
     because of space constraints on the screen. 

     We'll take a look at the Wages, ActivityList, and how California is included 
     in the hierarchy.
     
[Slide #12]
Creating the TaXML Schema

     XDR rather than DTD 

     Working in XML 

          Using a browser 
          Using an XML editor 

     Declaring the namespace

Notes: 


     XDR vs DTD 

     There are several languages for writing schemas; XML Document Type Definitions 
     (DTD) is the only standard one and the most widely used today. 

     Microsoft's XML tools use a proprietary schema language called XML-Data Reduced 
     (XDR), which makes a number of improvements over DTD. For example, XDR
     schemas can specify data types (e.g., decimal number, date) and allowable values 
     of elements, enabling p

     Microsoft says it will eventually replace XDR with the XML Schema language, 
     currently under development by the W3C with input from Microsoft and other
     companies. Microsoft also says it will provide automated tools for translating 
     XDR schemas to XML Schema. 

     We are using XDR in this schema. 

     Working in XML 

     Now we're going to look at XML. Up until now, the material we've looked at has 
     been the analysis and organization of the material, without actually being in an XML
     file or schema. 

     I'm opening the schema in Internet Explorer. So technically what you're 
     looking at is an HTML representation of the schema, rather than pure XML. HTML adds
     the colors and allows you to expand or collapse parts of the schema to make 
     it easier to view. 

     Declaring the namespace 

     The first line in the file is the declaration of the namespaces. We're using 
     the Microsoft namespaces for both the schema and the datatypes. 
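
     For readers without the schema file at hand, the opening of an XDR schema
     typically looks like the fragment embedded below. The two namespace URNs are
     the ones Microsoft's XML tools use for XDR schemas and datatypes; the schema
     name and the single TaxYear declaration are assumptions added so the fragment
     can be parsed on its own.

```python
import xml.etree.ElementTree as ET

XDR_NS = "urn:schemas-microsoft-com:xml-data"
DT_NS = "urn:schemas-microsoft-com:datatypes"

# Illustrative opening of an XDR schema; the name attribute and the
# TaxYear declaration are assumptions, not copied from the TaXML schema.
SCHEMA_OPENING = f"""
<Schema name="TaXML"
        xmlns="{XDR_NS}"
        xmlns:dt="{DT_NS}">
  <ElementType name="TaxYear" content="textOnly" dt:type="string"/>
</Schema>
"""

root = ET.fromstring(SCHEMA_OPENING)
# ElementTree expands each prefix to {namespace-uri}local-name.
print(root.tag)
tax_year_decl = root[0]
print(tax_year_decl.get(f"{{{DT_NS}}}type"))
```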
     
[Slide #13]

TaXML AttributeType

     AttributeType 

          tsj 
          state 
          keyfield 
          Format 

Notes: 


     AttributeTypes 

     The attribute feature allows you to add additional information to an element. This can
     be used to validate, sort, and display data. 

     TSJ 

     The 'tsj' attribute is probably one you can figure out with little help. Does 
     anyone know what this is? 

     By adding the 'tsj' attribute, we set the stage to be able to identify which 
     income belongs to the taxpayer, the spouse, or to them as a couple. This can be used to
     determine the optimal filing status, and is especially critical to many of the states. 

     State 

     Sourcing income to a particular state or city is integral to the preparation 
     of correct state returns. 

     Keyfield 

     The keyfield attribute might be better named something like "check total". 
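
     A minimal sketch of how the 'tsj' attribute could be used once it is in the
     data. The element names WagesList, Wages, and Amount and the attribute values
     T, S, and J (taxpayer, spouse, joint) are assumptions for illustration; the
     presentation does not list the permitted values.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from decimal import Decimal

# Hypothetical data fragment: tag names and tsj values are assumptions.
SAMPLE = """
<WagesList>
  <Wages tsj="T"><Amount>30000.00</Amount></Wages>
  <Wages tsj="S"><Amount>22000.00</Amount></Wages>
  <Wages tsj="T"><Amount>1500.00</Amount></Wages>
</WagesList>
"""


def totals_by_owner(root):
    """Sum wage amounts separately for each tsj value."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for wages in root.iter("Wages"):
        totals[wages.get("tsj")] += Decimal(wages.findtext("Amount"))
    return dict(totals)


owner_totals = totals_by_owner(ET.fromstring(SAMPLE))
print(owner_totals)
```

     With income split this way, a return prepared as married filing jointly could
     be recomputed as two separate returns without re-entering the data.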
     
[Slide #14]
TaXML Data Types

     Data types 

          fixed.14.4 
          float 
          boolean 
          date 
          int 
          string


Notes: 


     TaXML Data Types 

     Fixed.14.4 

     Fixed.14.4 is the type used for money fields. 

     Float 

     Used for percentages. 

     Boolean 

     Used for Yes/No or True/False situations. 

     Date 

     Used for dates, as long as a word such as "various" is not permitted. If it is,
     then the string type must be used. 

     Int 

     Used for numeric fields that are not currency. 

     String 

     Used for anything else! Text is obvious, but also for IDNumbers, calendar
     years, etc., where you want to be able to control format. 
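
     The sketch below maps each data type name on the slide to a Python parser, to
     show what each type accepts. It assumes the usual XDR conventions: booleans
     written as 0 or 1, dates in ISO form (YYYY-MM-DD), and fixed.14.4 allowing at
     most 14 digits before the decimal point and 4 after. The function names are
     invented for the example.

```python
from datetime import date
from decimal import Decimal


def parse_boolean(text):
    """Assumes booleans are written as 0 or 1."""
    if text not in ("0", "1"):
        raise ValueError(f"not a 0/1 boolean: {text!r}")
    return text == "1"


def parse_fixed_14_4(text):
    """Money fields: at most 14 integer digits and 4 fraction digits."""
    value = Decimal(text)
    if not value.is_finite():
        raise ValueError(f"not a finite number: {text!r}")
    _, digits, exponent = value.as_tuple()
    fraction_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    if integer_digits > 14 or fraction_digits > 4:
        raise ValueError(f"too many digits for fixed.14.4: {text!r}")
    return value


PARSERS = {
    "fixed.14.4": parse_fixed_14_4,
    "float": float,
    "boolean": parse_boolean,
    "date": date.fromisoformat,
    "int": int,
    "string": str,
}


def parse_value(dt_type, text):
    """Convert element text according to its declared data type."""
    return PARSERS[dt_type](text)


print(parse_value("fixed.14.4", "1234.56"))
print(parse_value("date", "2000-04-15"))
print(parse_value("string", "various"))
```

     A date field containing the word "various" fails to parse as a date, which is
     why such fields would have to be declared as string.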
     
[Slide #15]
ElementType Declarations

     Declaring the elements 

          Order 
          Format 

               Beginning < 
               ElementType 
               name 
               content 
               dt:type 
               Ending />


Notes: 


     Declaring the Elements 

     The methodology we used to generate this schema was to set up a database.
     This gave us a dependable, stable data source that was easily maintainable. A
     program was written using C++ that creates the schema. 

     Order 

     Each tag name that we use in the hierarchy must be declared in the schema
     using the 'ElementType' statement. 

     The ElementType declaration is best done for an item before it is included
     as a child of another element. 

     Another way to view this is that the elements that contain text, rather 
     than other elements, are handled first. 

     Although the first half of the schema is in alphabetical order, this is 
     only because the programmer chose to alphabetize the elements that did 
     not have child elements. 

     The order in which a schema is written is not hard and fast. The 
     language is still growing. 

     Format 

     Each line begins with a less-than sign. All XML schemas are XML files and 
     must conform to the required XML syntax. 

     Let's look at the AccountingMethod line as an example. 

     Note to self: Go to next slide. 
     
[Slide #16]
AccountingMethod Example


Notes: 


     Begin the line with the less-than sign. 

     Next is ElementType to identify what is to be defined. 

     Next is 'name'. This is the tag name from the hierarchy. 

     Content must be declared using the content="XXX" format. 

     The data type is declared using the dt:type="XXX" format, where XXX is string, boolean, etc. 

     The ending of the line is />. The slash is used to denote an empty element, 
     if you are familiar with that from your XML class. This simply means that 
     no data will be stored on this line. 

     Content 

     Content identifies the content of the element. The choices that XDR allows
     are empty, eltOnly, textOnly, and mixed. 

     'textOnly' means that the element contains data. Although 'textOnly' sounds 
     like it might only hold string data, this will hold any kind of data. 

     'eltOnly' means that this element can only contain other elements. An 
     example of this is the Name element within the Taxpayer element that we looked at early in the
     presentation. Name contains the elements FirstName and LastName that actually contain the 

     Mixed is not used in this schema. 
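
     Since the slide image with the AccountingMethod line is not reproduced here,
     the sketch below shows what declarations of this form look like and reads
     them back with Python. The data type given to AccountingMethod and the exact
     declarations are assumptions; only the overall shape (ElementType, name,
     content, dt:type, closing />) comes from the notes above.

```python
import xml.etree.ElementTree as ET

XDR_NS = "urn:schemas-microsoft-com:xml-data"
DT_NS = "urn:schemas-microsoft-com:datatypes"

# Hand-written declarations in the shape the notes describe; the
# dt:type values are assumptions.
SCHEMA = f"""
<Schema xmlns="{XDR_NS}" xmlns:dt="{DT_NS}">
  <ElementType name="AccountingMethod" content="textOnly" dt:type="string"/>
  <ElementType name="FirstName" content="textOnly" dt:type="string"/>
  <ElementType name="LastName" content="textOnly" dt:type="string"/>
  <ElementType name="Name" content="eltOnly">
    <element type="FirstName"/>
    <element type="LastName"/>
  </ElementType>
</Schema>
"""


def describe(schema_root):
    """Map each declared element name to (content, dt:type or None)."""
    return {
        et.get("name"): (et.get("content"), et.get(f"{{{DT_NS}}}type"))
        for et in schema_root.findall(f"{{{XDR_NS}}}ElementType")
    }


declarations = describe(ET.fromstring(SCHEMA))
for name, (content, dt_type) in declarations.items():
    print(name, content, dt_type)
```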
     
[Slide #17]
Building the Tree in XML

     The tree 

     Declaring elements that contain other elements 

     Example 


Notes: 


     Note to self: Search on TaXML to find the place in the file. 

     The Tree 

     The tree in an XML schema really just represents the parent-child 
     relationship between elements.
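
     To make the parent-child idea concrete, the sketch below reads a small
     hand-written XDR fragment and prints the hierarchy it implies, in the same
     indented style used on the earlier slides. The fragment is a reduced,
     assumed version of the Taxpayer declarations, not the real schema text.

```python
import xml.etree.ElementTree as ET

XDR_NS = "urn:schemas-microsoft-com:xml-data"

# Reduced, hand-written fragment: text-only elements are declared first,
# then the elements that contain them.
SCHEMA = f"""
<Schema xmlns="{XDR_NS}">
  <ElementType name="FirstName" content="textOnly"/>
  <ElementType name="LastName" content="textOnly"/>
  <ElementType name="IDNumber" content="textOnly"/>
  <ElementType name="Name" content="eltOnly">
    <element type="FirstName"/>
    <element type="LastName"/>
  </ElementType>
  <ElementType name="Taxpayer" content="eltOnly">
    <element type="IDNumber"/>
    <element type="Name"/>
  </ElementType>
</Schema>
"""


def children_by_parent(schema_root):
    """Map each declared element to the child types it references."""
    return {
        et.get("name"): [
            ref.get("type") for ref in et.findall(f"{{{XDR_NS}}}element")
        ]
        for et in schema_root.findall(f"{{{XDR_NS}}}ElementType")
    }


def render_tree(children, name, depth=0):
    """Return the hierarchy under name as indented lines."""
    lines = ["     " * depth + name]
    for child in children.get(name, []):
        lines.extend(render_tree(children, child, depth + 1))
    return lines


tree_lines = render_tree(children_by_parent(ET.fromstring(SCHEMA)), "Taxpayer")
print("\n".join(tree_lines))
```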

[Slide #18]
The XML Data File

     Creating the data file in "real life" 

          Schema under control of IRS 
          XML data files produced by 3rd party software 
          XML data files created by taxpayer entry on IRS web site 

     Typed in for this presentation 

     Demo of the XML file

Notes: 


     For this presentation, we simply typed the data into a well-formed XML 
     file and validated it against the schema. All that means is that if we entered data that did not
     match something in the schema, an error was displayed when we tried to view it in Inter

     In real life, the schema would be established by the IRS much the same as the 
     electronic filing record layouts and specifications are done currently. Third party
     software would then produce valid XML data files in the same way that they produce well forme

     In the near future, we hope to see XML data files produced by a taxpayer's 
     entries into the fields on an IRS web site. 

     Now let's take a look at an XML data file. I'm going to show you this file in 
     an XML editor instead of Internet Explorer just to give you another view of the world.
     Whether you use this tool or IE, the validations done are the same. 

     I should mention that we have used EF PATS files for our samples, so these may look familiar!
     
[Slide #19]
Sample XML Data File

     Identify the version 

     Include the schema to be used to validate this file 

     Data must be included between correctly named tags 

          Case sensitive 
          End tags 
          No overlap

Notes: 


     Version 

     The version is 1.0. To create a sample file of your own, just copy this line. 

     Schema 

     You must give the name (and path if it's in a different directory) of the schema. 

     Well-formed Rules 

     I'm sure you covered these rules in your class yesterday. 

     XML is case sensitive -- not a problem for those who grew up with UNIX. 

     You must have beginning and ending tags. 

     Unlike HTML, the tags may not overlap.
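
     The three well-formedness rules can be tried out with any XML parser. The
     sketch below uses Python's standard library parser, which checks
     well-formedness only and does not validate against the schema. The
     x-schema: prefix is the convention Microsoft's parser uses to locate an XDR
     schema; the schema file name shown is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The schema file name in the x-schema: reference is hypothetical.
GOOD = """<?xml version="1.0"?>
<TaXML xmlns="x-schema:TaXML-schema.xml">
  <TaxYear>1999</TaxYear>
</TaXML>"""

CASE_MISMATCH = "<TaXML><TaxYear>1999</taxyear></TaXML>"
MISSING_END_TAG = "<TaXML><TaxYear>1999</TaXML>"
OVERLAPPING = "<TaXML><Name><FirstName>Pat</Name></FirstName></TaXML>"

for label, sample in [
    ("good", GOOD),
    ("case mismatch", CASE_MISMATCH),
    ("missing end tag", MISSING_END_TAG),
    ("overlapping tags", OVERLAPPING),
]:
    print(label, is_well_formed(sample))
```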
     
[Slide #20]

Validation of Data

     XML validates data against the schema and thus ensures a correctly formed file 

     As with our current electronic filing system, however, there will be a need for checking
     content 

     There would need to be calculations done with the XML data after transmission of the file to
     the IRS
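
     As an illustration of a content check that schema validation alone would not
     perform, the sketch below tests the format of values declared as plain
     strings. The rules used (IDNumber is nine digits without dashes, TaxYear is
     four digits) and the sample element layout are assumptions, not requirements
     stated in the presentation.

```python
import re
import xml.etree.ElementTree as ET

# Assumed formats for fields that the schema types only as strings.
ID_PATTERN = re.compile(r"\d{9}")
YEAR_PATTERN = re.compile(r"\d{4}")

# Hypothetical data fragment with one malformed spouse IDNumber.
SAMPLE = """
<TaXML>
  <TaxYear>1999</TaxYear>
  <Taxpayer><IDNumber>123456789</IDNumber></Taxpayer>
  <Spouse><IDNumber>12345-678</IDNumber></Spouse>
</TaXML>
"""


def content_errors(root):
    """Return a list of content problems a schema check would miss."""
    errors = []
    year = root.findtext("TaxYear") or ""
    if not YEAR_PATTERN.fullmatch(year):
        errors.append(f"TaxYear is not four digits: {year!r}")
    for elem in root.iter("IDNumber"):
        if not ID_PATTERN.fullmatch(elem.text or ""):
            errors.append(f"IDNumber is not nine digits: {elem.text!r}")
    return errors


errors = content_errors(ET.fromstring(SAMPLE))
print(errors)
```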
     
[Slide #21]
Displaying Data With XSL

     XML data storage versus use of the data 

     XSL is a separate language 

          Very new so hard to find information 
          Uses XML syntax 

     XSL file

Notes: 


     XML Storage Vs Display 

     Perhaps the greatest advantage of using XML is the power to store 
     data separately from how you use the data. I'll show you a demo in 
     a minute that displays the same XML data file in two different formats. 

     XSL as a Language 

     Information is beginning to become available about XSL. My latest 
     check of the bookstores shows several good-looking books due to be published this summer. 

     Since XSL uses the same syntax as XML, you don't have to learn another language. 
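
     The separation of storage from display can be illustrated without XSL. The
     sketch below is not an XSL stylesheet: it uses two plain Python functions as
     stand-ins for two stylesheets, each rendering the same unchanged XML data in
     a different format. The sample data and function names are invented for the
     example.

```python
import html
import xml.etree.ElementTree as ET

# Sample data invented for the example.
DATA = ET.fromstring(
    "<Taxpayer><Name><FirstName>Pat</FirstName>"
    "<LastName>Example</LastName></Name>"
    "<HomePhone>555-0100</HomePhone></Taxpayer>"
)


def as_text(taxpayer):
    """Render as a single plain-text line, last name first."""
    first = taxpayer.findtext("Name/FirstName")
    last = taxpayer.findtext("Name/LastName")
    phone = taxpayer.findtext("HomePhone")
    return f"{last}, {first} ({phone})"


def as_html_row(taxpayer):
    """Render the same data as an HTML table row."""
    first = taxpayer.findtext("Name/FirstName")
    last = taxpayer.findtext("Name/LastName")
    phone = taxpayer.findtext("HomePhone")
    name_cell = html.escape(f"{first} {last}")
    return f"<tr><td>{name_cell}</td><td>{html.escape(phone)}</td></tr>"


text_view = as_text(DATA)
html_view = as_html_row(DATA)
print(text_view)
print(html_view)
```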
     
[Slide #22]

Summary

     Hierarchy 

     XML schema and XML data file 

     XSL 

     Questions?

Notes: 


     Hierarchy 

     We covered a possible data hierarchy that would reduce the amount of 
     data to be transmitted.
     
----------------------------

Prepared by Robin Cover for The XML Cover Pages archive.


Document URL: http://xml.coverpages.org/TaXML-PresentationPPT-TXT.html